THE TWO WEIGHT INEQUALITY FOR THE HILBERT TRANSFORM: A PRIMER

MICHAEL T LACEY

Abstract. Given a pair of weights $w, \sigma$, the two weight inequality for the Hilbert transform is of
the form $\|H(\sigma f)\|_{L^2(w)} \lesssim \|f\|_{L^2(\sigma)}$. Recent work of Lacey-Sawyer-Shen-Uriarte-Tuero and Lacey has
established a conjecture of Nazarov-Treil-Volberg, giving a real-variable characterization of the pairs
of weights for which this inequality holds, provided the pair of weights do not share a common point mass. In
this paper, the characterization is proved, collecting details from across several papers; compactness is
characterized; all relevant estimates are proved; counterexamples are detailed; and areas of application are
indicated.

1. Introduction

By a weight we mean a non-negative, locally finite Borel measure, typically on $\mathbb{R}$. We consider the
two weight inequality for the Hilbert transform for a pair of weights $w, \sigma$ on $\mathbb{R}$:

(1.1) $\sup_{0 < \tau < 1} \|H_\tau(\sigma f)\|_{L^2(w)} \le \mathcal{N} \|f\|_{L^2(\sigma)}$.

Here, $\mathcal{N}$ denotes the best constant in the inequality. And $H_\tau \nu(x)$, for a signed measure $\nu$, denotes a
standard truncation of the usual Hilbert transform, given by

(1.2) $H_\tau \nu(x) := \int_{\tau < |x - y| < \tau^{-1}} \frac{\nu(dy)}{y - x}$.

The inequality is phrased this way as the familiar principal value need not exist in the settings we are
interested in. Below, however, we will systematically suppress the truncation parameter $\tau$, understanding
that the relevant inequalities are uniform over $0 < \tau < 1$. In areas of application, there are no a priori
conditions placed upon the weights that can arise in this question, forcing us to consider the question in
this form.

The central question is then a real-variable characterization of the inequality (1.1). In the special
case that the pair of weights $\sigma$ and $w$ do not share a common point mass, this was supplied in two
papers, one of Lacey-Sawyer-Shen-Uriarte-Tuero [23], and another of the present author [17], answering
a beautiful conjecture of Nazarov-Treil-Volberg [54].

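Theorem 1.3. Suppose that for all $x \in \mathbb{R}$, $\sigma(\{x\}) \cdot w(\{x\}) = 0$ for the pair of weights $\sigma, w$. Define two
positive constants $\mathcal{A}_2$ and $\mathcal{T}$ as the best constants in the inequalities below, uniform over intervals $I$.

(1.4) $P(\sigma, I) \cdot P(w, I) \le \mathcal{A}_2$,

(1.5) $\int_I H(\sigma \mathbf{1}_I)^2 \, dw \le \mathcal{T}^2 \sigma(I)$,

(1.6) $\int_I H(w \mathbf{1}_I)^2 \, d\sigma \le \mathcal{T}^2 w(I)$.

There holds $\mathcal{N} \simeq \mathcal{H} := \mathcal{A}_2^{1/2} + \mathcal{T}$.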

The first condition is an extension of the Muckenhoupt A2 condition to a Poisson condition. The
exact Poisson extension of $\sigma$ to the upper half-plane is not needed; rather, we use the approximation
below, which is roughly the Poisson extension evaluated at the center of $I$, at height $|I|$ in the
half-plane.

$P(\sigma, I) := \int_{\mathbb{R}} \frac{|I|}{(|I| + \operatorname{dist}(x, I))^2} \, \sigma(dx)$.
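As a numerical illustration of this Poisson-type average, the following sketch evaluates $P(\sigma, I) = \int |I| / (|I| + \operatorname{dist}(x, I))^2 \, \sigma(dx)$ for a purely atomic weight. The helper names `dist_to_interval` and `poisson_average`, and the representation of a weight as a list of (point, mass) pairs, are assumptions made only for this illustration.

```python
def dist_to_interval(x, left, right):
    """Distance from the point x to the closed interval [left, right]."""
    if x < left:
        return left - x
    if x > right:
        return x - right
    return 0.0


def poisson_average(atoms, left, right):
    """P(sigma, I) for sigma = sum of mass * delta_point and I = [left, right]."""
    length = right - left
    return sum(
        mass * length / (length + dist_to_interval(point, left, right)) ** 2
        for point, mass in atoms
    )


# A unit mass inside I = [0, 1] contributes 1/|I| = 1; a unit mass at
# distance 1 from I contributes |I| / (|I| + 1)^2 = 1/4.
sigma = [(0.5, 1.0), (2.0, 1.0)]
print(poisson_average(sigma, 0.0, 1.0))  # 1.25
```

Mass inside $I$ is weighted by $1/|I|$, so $P(\sigma, I) \ge \sigma(I)/|I|$, while mass far from $I$ is discounted quadratically in its distance.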

The remaining conditions are referred to as the Sawyer-type testing conditions, as he first introduced 
these conditions into the two weight setting in his fundamental papers on the maximal function [47], 
and later the fractional and Poisson integral operators [48]. It is well-known that the A2 condition 
is necessary for the two weight inequality, and it is obvious that the testing conditions are necessary. 
Thus, the substance of the result concerns the sufficiency of the A2 and testing inequalities for the norm 
inequality. 

This Theorem is a central result in non-homogeneous harmonic analysis, as founded in a sequence
of influential papers of Nazarov-Treil-Volberg [30-32].

Note however that the A2 condition trivially fails if the two weights share a common point mass. In
addition, the proof of the theorem is very involved, encompassing arguments and points of view that
were spread across several papers [17,22,23,33]. An important additional result, Sawyer's two weight
inequality for the Poisson integral [48], is not well-represented in the literature. Finally, the interest in
the two weight inequality is well-motivated by applications to operator theory, model spaces, and spectral
theory, themselves spread across additional papers.

The point of this paper is to

(a) state and prove the Theorem, in all detail, including a complete proof of Sawyer's two weight Poisson
inequality;

(b) supply the characterization of compactness of $H_\sigma$, which is a bit more complicated than one might
expect;

(c) give the proof under the influential pivotal condition, which serves to highlight where the difficulties
arise in the general case;

(d) collect some relevant counterexamples;

(e) give complements and extensions of the theorem, and the proof techniques, including a conjecture
about the case in which the pair of weights share common point masses;

(f) and point to areas of applications.

Sections proceed directly towards proofs, but many conclude with some context and discussion. The
proof is entirely elementary, assuming only well-known facts about martingale differences.

1.1. Compactness. A characterization of compactness follows, in terms that are more complicated than
the 'vanishing conditions' one would immediately suspect. An example given at the beginning of §9 shows
that the vanishing conditions are not necessary.

Theorem 1.7. Let $\sigma, w$ be two weights on $\mathbb{R}$ which do not share a common point mass, and for which the two
weight inequality (1.1) holds. Then, $H_\sigma : L^2(\sigma) \to L^2(w)$ is compact if and only if for all $0 < \lambda < \mathcal{N}$,
there is a finite integer $N_\lambda$ such that if $I_1, \dots, I_N$ are pairwise disjoint intervals contained in $\mathcal{I}_\lambda$, then
$N \le N_\lambda$. Here, $\mathcal{I}_\lambda := \mathcal{I}_\lambda(\sigma, w)$ is the class of intervals which meet any of these three conditions.

(1.8) $P(w, I) \cdot P(\sigma, I) \ge \lambda$,

(1.9) $\int_I H(\sigma \mathbf{1}_I)^2 \, dw \ge \lambda \sigma(I)$,

(1.10) $\int_I H(w \mathbf{1}_I)^2 \, d\sigma \ge \lambda w(I)$.

1.2. An Overview of the Proof. The result is specific to the Hilbert transform, meaning that particular
properties of this transform must guide the proof. The elementary examples of these are the monotonicity
principle, Lemma 3.5, valid for all pairs of weights, and then the energy inequality, Lemma 3.8, valid
under the assumption of interval testing and the A2 condition. There is a third critical property, the
functional energy inequality. These properties are a last vestige of positivity: The derivative of the kernel
is positive, $\frac{d}{dy} \frac{-1}{y} = \frac{1}{y^2} > 0$.

The proof strategy is outlined in Figure 1. One begins with the bilinear form $\langle H_\sigma f, g \rangle_w$. The passage
to the 'triangular forms' in Theorem 4.4 is a rather standard step in many T1-type theorems. It only
depends upon the A2 condition imposed on the weights. The two triangular forms are dual to one
another. There are two steps in the analysis, a 'global to local' reduction in §4, and an analysis of the
'stopping form' in §5. To this point, the steps are familiar to experts in the T1 theorem. But carrying
out these last two steps necessarily depends upon novel techniques.

Key is a joint analysis of both the weights $(\sigma, w)$ and the pair of functions on which the boundedness
of the Hilbert transform is being tested. The decomposition of the weights occurs through the energy
inequality, and the functions by the use of Calderón-Zygmund stopping data, as in Lemma 4.6. (These
are not included in Figure 1.) This accomplishes three points: (1) the averages of $f$ are controlled, (2)
certain irregularities in the pairs of weights, as expressed through the energy inequality, are accounted
for, and (3) one has access to the very useful quasi-orthogonality estimate (4.9).

[Figure 1. A schematic tree of the proof of the main theorem. Node labels: $\langle H_\sigma f, g \rangle_w$; A2;
Triangular Forms (4.2); global to local; Local Estimate (4.16); testing; Stopping Form (5.1);
2 Weight Poisson §8; Functional Energy §7; Size Lemma 5.6.]

Controlling the averages of $f$ is essential to the 'global to local reduction' in Theorem 4.10. A
simple appeal to the testing condition allows an application of the monotonicity principle to rephrase the
inequality in this Theorem as a certain two-weight inequality for the Poisson integral. In this inequality,
the Poisson integral maps functions on $\mathbb{R}$ to those on $\mathbb{R}^2_+$. The weight on $\mathbb{R}$ is, say, $\sigma$. The weight on
$\mathbb{R}^2_+$ is then derived from $w$ in a specific fashion. The resulting inequality, called the functional energy
inequality, is a deep extension of the energy inequality. It is then very fortunate that Sawyer has proved
the two weight inequality for the Poisson integral, §8, and only testing conditions need be verified. These
are then reduced to the A2 condition, and interval testing for the Hilbert transform. Notably, it is only
here that the full Poisson A2 condition is used.

The local term is then dominated by the analysis of the stopping form (5.1). This is again a familiar
object to experts in the T1 theorem, addressed by ad hoc off-diagonal estimates, which absolutely do not
apply in the current context. Control of the irregularities of the weights is now the main point, complicated
by the fact that the stopping form is not intrinsically defined. A notion of 'size' is introduced; it
serves as an approximation of the operator norm of the stopping form. The size Lemma, Lemma 5.6,
makes this precise: One can decompose a stopping form into constituent parts. Those of large size have
a simpler form, which allows one to estimate their operator norm by size. What is left has smaller size,
and so one can recurse.

Some readers will have noticed that a very common set of objects, Carleson measures, are not mentioned,
and indeed, they do not appear in the proof at all. (Induced measures for the Poisson integral
in the upper half plane are however essential.) The widespread prevalence of Carleson measures in
T1 theorems can be traced to two facts: first, that associated paraproduct operators are the principal
obstacle to a simple proof, and second, that the paraproduct operators have an essentially canonical form. In
this theorem, neither of these facts holds, and so we have abandoned the notions of Carleson measures
and paraproducts.

1.2.1. Common Pointmasses. The A2 condition (1.4) will fail if the pair of weights share a common
point mass. Assuming this is the only essential new difficulty, one arrives at the following formulation of
a conjecture. Define

(1.11) $p_I(x)^2 := \frac{|I|}{(|I| + \operatorname{dist}(x, I))^2}$,

so that $P(\sigma, I) = \|p_I\|^2_{L^2(\sigma)}$. Suppose that there is a $b_I$ which is a point mass for both $w$ and $\sigma$, and
moreover



$p_I(b_I)^2 \sigma(\{b_I\}) > \frac{1}{2} \int p_I(x)^2 \, \sigma(dx)$ and $p_I(b_I)^2 w(\{b_I\}) > \frac{1}{2} \int p_I(x)^2 \, w(dx)$.



Such a point is unique, if it exists. Set $\sigma_I = \sigma - \sigma(\{b_I\}) \delta_{b_I}$. If there is no such point $b_I$, set $\sigma_I = \sigma$.
Do likewise for $w_I$.

Conjecture 1.12. For two weights $\sigma, w$ on $\mathbb{R}$, the two weight inequality (1.1) holds if and only if the
two testing inequalities (1.5) and (1.6) hold, and this modified A2 condition holds:

(1.13) $\sup_{I \text{ an interval}} P(\sigma_I, I) P(w, I) + P(w_I, I) P(\sigma, I) < \infty$.

It is not clear how to modify the existing proof to obtain this conjecture. A subtle part of the definition
of the modified A2 condition is that the point $b_I$ can be located at an arbitrary point on the real line.
This is a difficulty, from the point of view of the proof strategy, which at almost all points needs only the
'half-Poisson' condition

(1.14) $\sup_{I \text{ an interval}} P(\sigma \mathbf{1}_{\mathbb{R} \setminus I}, I) \frac{w(I)}{|I|} + \frac{\sigma(I)}{|I|} P(w \mathbf{1}_{\mathbb{R} \setminus I}, I) < \infty$.

This condition is a consequence of the norm inequality (1.1), even when $\sigma$ and $w$ share a common point
mass.

The one place at which the full Poisson A2 condition is needed is in the proof of the functional
energy inequality, in §7.4. Indeed, to deduce this inequality, we only need a two weight inequality for the
Poisson integral 'with holes,' the holes arising from the Calderón-Zygmund stopping data, see §4.1.6.
Efforts to characterize this inequality have not been successful, and the path of the proof is to 'fill in the
holes', strongly exploiting the Carleson measure property of the stopping data. Filling in the holes then
places Sawyer's two weight inequality for the Poisson integral at one's disposal, and the verification of
the testing inequality in §7.4 then invokes the full Poisson A2 condition, with little hope of addressing
a problematic common point mass.

A second concern about the general case is a lack of continuity of the question in the weights, when
point masses are involved. The simplest example of weights sharing a common point mass is counting
measure on $\mathbb{Z}$: $\sigma = w = \sum_{n \in \mathbb{Z}} \delta_n$. If one takes however $w_\epsilon = \sum_{n \in \mathbb{Z}} \delta_{n + \epsilon}$, for $0 < \epsilon < 1$, note that the
norm of $H_\sigma$ from $L^2(\sigma)$ to $L^2(w_\epsilon)$ is of order $\epsilon^{-1}$.
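The lower bound half of this claim can be seen by testing on a single atom. Assuming the kernel $1/(y - x)$ and ignoring truncations, the function $f = \mathbf{1}_{\{0\}}$ has $\|f\|_{L^2(\sigma)} = 1$ and $H(\sigma f)(x) = -1/x$, so that $\|H(\sigma f)\|^2_{L^2(w_\epsilon)} = \sum_n (n + \epsilon)^{-2} \ge \epsilon^{-2}$. The sketch below evaluates this sum numerically; the function name and the cutoff `n_max` are illustrative choices, not taken from the text, and only a lower bound on the norm is computed.

```python
import math


def single_atom_lower_bound(eps, n_max=10_000):
    """Square root of sum_{|n| <= n_max} (n + eps)^(-2).

    This is the L^2(w_eps) norm of x -> -1/x restricted to finitely many
    atoms of w_eps, hence a lower bound for the operator norm.
    """
    total = sum((n + eps) ** -2 for n in range(-n_max, n_max + 1))
    return math.sqrt(total)


for eps in (0.1, 0.01, 0.001):
    bound = single_atom_lower_bound(eps)
    # eps * bound stays close to 1, so the bound grows like 1/eps.
    print(eps, bound, eps * bound)
```

The dominant term is the atom of $w_\epsilon$ at distance $\epsilon$ from the atom of $\sigma$ at the origin, which is the source of the blow-up as $\epsilon \to 0$.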

1.3. The A2 Theory. The classical case of an A2 weight corresponds to the case of $w(dx) = w(x)\,dx$,
and $w(x) > 0$ a.e. Moreover, the weight $\sigma$ also has density given by $\sigma(x) := w(x)^{-1}$. It is assumed that
both $w$ and $\sigma$ are locally integrable, so that they are both weights. Note that $w(x) \cdot \sigma(x) \equiv 1$. The
Muckenhoupt A2 condition asserts that this same equality approximately holds, uniformly over location
and scale.

$[w]_{A_2} := \sup_I \frac{w(I)}{|I|} \cdot \frac{\sigma(I)}{|I|} < \infty$.

This condition is equivalent to the uniform norm bound on $L^2(w)$ for the class of simple averaging
operators

$f \mapsto |I|^{-1} \int_I f \, dx \cdot \mathbf{1}_I$, $I$ is an interval.



From this condition flows a rich theory, including the boundedness of all Calderón-Zygmund operators.
The classical result of Hunt-Muckenhoupt-Wheeden [12] states that $w$ is in $A_2$ if and only if the Hilbert
transform maps $L^2(w)$ to $L^2(w)$. By a basic change of variables argument, first noted by Sawyer in
[47], this is equivalent to $H_\sigma$ mapping $L^2(\sigma)$ to $L^2(w)$. Stefanie Petermichl [39] quantified the
Hunt-Muckenhoupt-Wheeden theorem as follows.

Theorem A. A weight $w \in A_2$ if and only if $H$ is bounded from $L^2(w)$ to $L^2(w)$, and moreover

$\mathcal{N} \simeq [w]_{A_2}$.

To place this result in the context of our main result, it is classical and easy to see that the Poisson A2
characteristic satisfies $\mathcal{A}_2 \lesssim [w]_{A_2}^2$. And, using the remarkable Haar shift representation of the Hilbert
transform due to Petermichl [38], one can check that the testing condition satisfies $\mathcal{T} \lesssim [w]_{A_2}$. This is
what Petermichl's original proof did. A more conceptual approach to this estimate was given in [18].
All existing proofs of this fact depend ultimately on known Lebesgue measure estimates for the Hilbert
transform; the latter are irrelevant for the two weight theorem.

It is perhaps worth emphasizing that the powerful Haar shift technique of Petermichl, even with its
impressive extension by Hytönen [13], seems to be of little use in the general two weight problem. There
are two obstacles: Firstly, in order to use it, one must essentially have control on a Haar shift operator,
independently of how the grid defining the shift is defined. The resulting condition on the pair of weights
is more subtle than the two weight inequality for the Hilbert transform. Secondly, one should recover
the energy inequality of Lemma 3.8. But, the energy of any fixed Haar shift is zero, and indeed, the two
weight inequality for Haar shift operators [34] has just a few difficulties in its proof.

By the A2 Theorem, it is meant the linear in A2 bound for all Calderón-Zygmund operators. This
result, pursued by many, and established by Hytönen [13], has many points of contact with the subject
of this note. But, we refer the reader to [15] and references therein for more information.

In the A2 theory, it is essential that $w(x) > 0$ a.e. Suppose one relaxes this condition to: $w(x)$ is
positive on a measurable set $E \subset \mathbb{R}$, and define $\sigma(x)$ to be supported on $E$, and equal to $w(x)^{-1}$. One
can then ask if the Hilbert transform is bounded for this pair of weights, and Theorem 1.3 applies here.
This question is an instance of the non-homogeneous A2 theory advocated by A. Volberg. Despite the
specificity in the way the weights are prescribed, it does not seem that any simplifications accrue in the
proof of the main theorem.

1.4. The Individual Two Weight Problem. Given an operator $T$, the individual $L^p$ two weight
inequality for $T$ is the inequality

(1.15) $\|T(\sigma f)\|_{L^p(w)} \le \mathcal{N}_T \|f\|_{L^p(\sigma)}$.

Here and throughout we use the notation $T_\sigma f := T(\sigma f)$. We understand that $T$ applied to a signed
measure $\sigma f$ should make sense. And, the inequality above is the preferred form of the inequality as
duality is expressed in the natural way: The inequality (1.15) is equivalent to

$\|T^*_w g\|_{L^{p'}(\sigma)} \le \mathcal{N}_T \|g\|_{L^{p'}(w)}$.

The question is then to characterize the pairs of weights for which (1.15) holds.

This specificity of the question is of interest for a few canonical operators, ones for which the corresponding
two weight inequality will naturally present itself. The leading examples of this are, for positive
operators, the Hardy operator [27], the maximal function, Sawyer's Theorem of 1981 [47], and Sawyer's
1988 theorem for the fractional integrals [48]. It is noteworthy that the two weight inequalities for the
Hardy and the Poisson integral are used in the proof of our main theorem.

It is interesting to note that this is not only a chronological list, but it also reflects the depth of the results
as well. The Hardy operator is easiest, characterized by an 'A2-type condition.' It was Sawyer's insight,
however, that the maximal function characterization requires a testing condition. The fractional integrals
are harder still. For the sake of comparison, let us state a special case of the result for the fractional
integrals in one dimension.^1 One can also compare to Theorem G for the Poisson integral. Both results
give a characterization in terms of testing conditions. And, while we state just one case of the general
result, one should note that there is no Sobolev condition imposed on the $L^p$ indices.

Theorem B. For two weights $w, \sigma$, and $0 < \alpha < 1$, the operator $R_\alpha f(x) := \int f(x - y) \frac{dy}{|y|^{1 - \alpha}}$ maps $L^2(\sigma)$
to $L^2(w)$ if and only if the testing inequalities below hold.



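$\int_I R_\alpha(\sigma \mathbf{1}_I)^2 \, dw \le \mathcal{T}^2 \sigma(I)$,

$\int_I R_\alpha(w \mathbf{1}_I)^2 \, d\sigma \le \mathcal{T}^2 w(I)$.

Moreover the norm of the operator is equivalent to $\mathcal{T}$.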

The analysis of the individual two weight inequality for positive operators is much simpler, as is the case
of dyadic operators. For certain non-positive dyadic operators, see the result of Nazarov-Treil-Volberg
[34]. All of these results have found significant interest, due to the Haar shift operators of Petermichl
[38], the remarkable median inequality of Lerner [11], and the Hytönen representation theorem [13].

The Hilbert transform is the first non-positive continuous operator for which the individual two weight 
problem has been solved. And, one would only ever expect that the solution would be of interest (or 
even possible) for a few canonical choices of operators, such as Hilbert, Cauchy and Riesz transforms. 
Foundational to the solution for the Hilbert transform is the positivity of the derivative of the kernel. 
No other canonical choice will satisfy such a simple condition. 

The individual two weight question makes sense for any $1 < p < \infty$, and there are characterizations
in this, and other off-diagonal cases for positive operators. But, for singular integrals, it does not seem
that there will be useful characterizations in the case of $p \ne 2$. See [20] for some results in this case, for
maximal truncations of singular integrals.

1.5. The Hilbert Transform. The two weight inequality for the Hilbert transform was addressed as
early as 1976 by Muckenhoupt and Wheeden [28].^2 But, it received much wider recognition as an
important problem with the 1988 work of Sarason [46]. The latter was part of an important sequence of
investigations that identified de Branges spaces as an essential tool in operator theory. His question
concerning the composition of Toeplitz operators, see §12.1, was raised therein, and advertised again in
[45]. This question related the individual two weight problem for the Hilbert transform to a profound
question from operator theory.

^1 Besides Sawyer's results, one should also consult Cascante-Ortega-Verbitsky [6].

^2 In particular, they noted that the simple A2 condition was not sufficient for the boundedness of the Hilbert transform, and
conjectured that half-Poisson A2 conditions would be sufficient, an indication of the powerful sway held by the Muckenhoupt
A2 condition in the early years of the weighted theory.

While not stated in the language of the Hilbert transform, Sarason wrote that it was 'tempting' to
conjecture that the full Poisson A2 condition would be sufficient for the two weight inequality. In an
important development, F. Nazarov [29] showed that this was not the case. The two weight problem was
seen to be an important component of model spaces. In particular, a more delicate counterexample was
developed by Nazarov-Volberg [35] to disprove a conjectured characterization of the Carleson measures
in the unit disk for a model space. For more on this see §11. It is worth noting that in Sarason's
question, the weights have a density $|f|^2$, for analytic $f$, and the subharmonicity could be an important
part of the problem. But, in the context of model spaces, completely arbitrary measures can arise. In
one example discussed in detail in §11, one of the weights is uniform measure on a Cantor set.

Nazarov-Treil-Volberg were creating the field of non-homogeneous harmonic analysis, in a series of
ground-breaking papers [30-32]. Their work, and a revitalization of the perspective of Eric Sawyer from
the 1980's, led them to conjecture the characterization proved in this paper. Moreover, their influential
proof strategy, devised in [33,54], led to a verification of the conjecture in the case that both weights
were doubling. This paper uses their strategy, with several additional features. At the same time, their
approach is generic, in that it applies to general Calderón-Zygmund operators. Specific properties of the
Hilbert transform had to be used in the characterization. These properties were identified in [17,21-23],
and a more precise description of what was accomplished at each stage is spread throughout the
paper.

2. Preliminaries 

2.1. Dyadic Grids and Haar Functions. A grid is a collection $\mathcal{D}$ of closed intervals so that for all
$I, J \in \mathcal{D}$, $I \cap J \in \{\emptyset, I, J\}$. Further say that $\mathcal{D}$ is a dyadic grid if for all integers $n$, the collection
$\{I \in \mathcal{D} : |I| = 2^n\}$ partitions $\mathbb{R}$, aside from the endpoints of the intervals.

For a subcollection $\mathcal{F}$ of a dyadic grid $\mathcal{D}$, set $\pi_{\mathcal{F}} I$ to be the minimal element of $\mathcal{F}$ that contains $I$;
here $I$ need not be a member of $\mathcal{F}$. Set $\pi^1_{\mathcal{F}} I$ to be the minimal member of $\mathcal{F}$
that strictly contains $I$, and inductively define $\pi^{t+1}_{\mathcal{F}} I = \pi^1_{\mathcal{F}}(\pi^t_{\mathcal{F}} I)$.

Say that the collection $\mathcal{D}$ is admissible for a weight $\sigma$ if $\sigma$ does not have a point mass at any endpoint
of an interval $I \in \mathcal{D}$.

2.2. Haar Functions. Let $\mathcal{D}$ be admissible for $\sigma$, a weight on $\mathbb{R}$. If $I \in \mathcal{D}$ is such that $\sigma$ assigns
non-zero weight to both children of $I$, the associated Haar function is chosen to have a non-negative
inner product with the independent variable, $\langle x, h^\sigma_I(x) \rangle_\sigma \ge 0$, a convenient choice due to the central
role of the energy inequality, (3.9).

(2.1) $h^\sigma_I := \sqrt{\frac{\sigma(I_-) \sigma(I_+)}{\sigma(I)}} \left( \frac{I_+}{\sigma(I_+)} - \frac{I_-}{\sigma(I_-)} \right)$.

In this definition, we are identifying an interval with its indicator function, and we will do so throughout
the remainder of the paper. This is an $L^2(\sigma)$-normalized function, and has $\sigma$-integral zero. If $\sigma$ is
supported only on one child of $I$, then we set $h^\sigma_I = 0$.
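The normalization and mean-zero properties of the weighted Haar function can be checked directly from the two masses $\sigma(I_-)$ and $\sigma(I_+)$. The sketch below assumes the formula $h^\sigma_I = \sqrt{\sigma(I_-)\sigma(I_+)/\sigma(I)}\,\bigl(I_+/\sigma(I_+) - I_-/\sigma(I_-)\bigr)$, with $I_-$ the left child and $I_+$ the right child (an assumption consistent with the non-negative inner product with $x$); the function name `haar_values` is hypothetical.

```python
import math


def haar_values(mass_left, mass_right):
    """Values taken by the weighted Haar function on I_- and on I_+.

    mass_left = sigma(I_-) and mass_right = sigma(I_+), both assumed positive.
    """
    c = math.sqrt(mass_left * mass_right / (mass_left + mass_right))
    return -c / mass_left, c / mass_right


# A very unbalanced (non-doubling) situation: sigma(I_-) = 1, sigma(I_+) = 99.
lo, hi = haar_values(1.0, 99.0)
norm_sq = lo ** 2 * 1.0 + hi ** 2 * 99.0   # integral of h^2 d(sigma)
mean = lo * 1.0 + hi * 99.0                # integral of h d(sigma)
print(lo, hi, norm_sq, mean)
```

With these masses the negative value is about 99 times larger in absolute value than the positive one, even though the function is still normalized in $L^2(\sigma)$ and has $\sigma$-integral zero up to rounding.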

For any dyadic interval $I_0$ with $\sigma(I_0) > 0$, the non-zero functions among $\{\sigma(I_0)^{-1/2} I_0\} \cup \{h^\sigma_I : I \in
\mathcal{D}, I \subset I_0\}$ form an orthonormal basis for $L^2(I_0, \sigma)$. We will use the notation $L^2_0(I_0, \sigma)$ for the subspace
of $L^2(I_0, \sigma)$ of functions with mean zero. It has an orthonormal basis consisting of the non-zero functions
in $\{h^\sigma_I : I \in \mathcal{D}, I \subset I_0\}$. These are familiar properties. But, another familiar property, that the positive
and negative values of $h^\sigma_I$ are comparable in absolute value, fails in a dramatic fashion for non-doubling
measures. See Figure 2.

Figure 2. Two Haar functions. For the left function, the weight is nearly equally
distributed between the two halves of the interval, in sharp contrast to the function on
the right, in which the weight on the right half is much larger than on the left.

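We will use the notation $\hat f(I) = \langle f, h^\sigma_I \rangle_\sigma$, as well as the equality below, holding for those $I$ to which
$\sigma$ assigns non-zero weight on both children.

$\Delta^\sigma_I f = \langle f, h^\sigma_I \rangle_\sigma h^\sigma_I = I_+ E^\sigma_{I_+} f + I_- E^\sigma_{I_-} f - I E^\sigma_I f$.

This is the familiar martingale difference equality, and so we will refer to $\Delta^\sigma_I f$ as a martingale difference.
It implies the familiar telescoping identity $E^\sigma_J f = \sum_{I : I \supsetneq J} E^\sigma_J \Delta^\sigma_I f$.

The Haar support of a function $f \in L^2(\sigma)$ is the collection $\{I : \hat f(I) \ne 0\}$.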

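2.3. Random Dyadic Grids. Let $\hat{\mathcal{D}}$ be the standard dyadic grid in $\mathbb{R}$, thus all intervals $[0, 2^n]$ for $n \in \mathbb{N}$
are in $\hat{\mathcal{D}}$. A random dyadic grid $\mathcal{D}$ is specified by $\omega = (\omega_n) \in \{0, 1\}^{\mathbb{Z}}$, and the elements are

$I = \hat I \dotplus \omega := \hat I + \sum_{n : 2^{-n} < |\hat I|} 2^{-n} \omega_n$, $\hat I \in \hat{\mathcal{D}}$.

The natural uniform probability measure $\mathbb{P}$ is placed upon $\{0, 1\}^{\mathbb{Z}}$.

Fix $0 < \epsilon < 1$ and $r \in \mathbb{N}$. An interval $I \in \mathcal{D}$ is said to be $(\epsilon, r)$-bad if there is an interval $J \in \mathcal{D}$
with $|J| > 2^r |I|$, and $\operatorname{dist}(I, \partial J) \le |I|^\epsilon |J|^{1 - \epsilon}$. Otherwise $I$ is said to be $(\epsilon, r)$-good. These are the basic
properties of this definition.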

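Proposition 2.2. These three properties hold.

(1) The property of $I = \hat I \dotplus \omega$ being $(\epsilon, r)$-good only depends upon $\omega$ and $|I|$.

(2) $p_{\mathrm{good}} := \mathbb{P}(I \text{ is } (\epsilon, r)\text{-good})$ is independent of $I$.

(3) $p_{\mathrm{bad}} := 1 - p_{\mathrm{good}} \lesssim \epsilon^{-1} 2^{-\epsilon r}$.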

Proof. An interval $I = \hat I \dotplus \omega$ is equally likely to be the left or right half of its parent $\pi_{\mathcal{D}} I$, depending only
on $\omega_n$, where $|I| = 2^{-n}$. Similarly, $I$ is equally likely to be any one of the $2^t$ potential positions in $\pi^t_{\mathcal{D}} I$,
and its exact position is determined by $\{\omega_n, \dots, \omega_{n+t-1}\}$. This proves the first two claims.

For the last, if $I$ is bad, then for some $t > r$, there holds $\operatorname{dist}(I, \partial \pi^t_{\mathcal{D}} I) \le 2^{(1 - \epsilon) t} |I|$. For this to happen,
it is necessary that the numbers $\{\omega_u : n + \lceil (1 - \epsilon) t \rceil \le u \le n + t - 1\}$ all be equal, and hence are
either all 0 or all 1. This clearly proves that

$p_{\mathrm{bad}} \lesssim \sum_{t = r + 1}^{\infty} 2^{-(t - \lceil (1 - \epsilon) t \rceil)} \lesssim \epsilon^{-1} 2^{-\epsilon r}$.

□
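The geometric tail bound used for the bad probability can be sanity-checked numerically. The sketch below sums $2^{-(t - \lceil (1 - \epsilon) t \rceil)}$ over $t > r$ and compares it with $4 \epsilon^{-1} 2^{-\epsilon r}$. The constant 4 is a crude choice made for this sketch (it follows from $t - \lceil (1 - \epsilon) t \rceil \ge \epsilon t - 1$ and $1 - 2^{-\epsilon} \ge \epsilon / 2$) and is not taken from the text; the function name and the truncation `terms` are likewise illustrative.

```python
import math


def bad_tail_sum(eps, r, terms=500):
    """Partial sum over t = r+1, ..., r+terms of 2^{-(t - ceil((1 - eps) t))}."""
    return sum(
        2.0 ** -(t - math.ceil((1.0 - eps) * t))
        for t in range(r + 1, r + 1 + terms)
    )


for eps, r in [(0.5, 10), (0.25, 20), (0.25, 40)]:
    # Second column: the tail sum; third column: the comparison bound.
    print(eps, r, bad_tail_sum(eps, r), 4.0 / eps * 2.0 ** (-eps * r))
```

For fixed $\epsilon$ the tail decays geometrically in $r$, which is what allows the bad part to be made as small as desired by taking $r$ large.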
This elementary proposition is used in the following fundamental way. Fix two weights $w, \sigma$. With
probability one, a random $\mathcal{D}$ is admissible for both $w$ and $\sigma$. Indeed, the collection of points that are
point masses for one of the two weights is a fixed countable collection of points. And any fixed point
has probability zero of being an endpoint of an interval in $\mathcal{D}$. Hence, we can, with probability one, define
the Haar basis adapted to these two weights. Write the identity operator on $L^2(\sigma)$ as

$f = P^\sigma_{\mathrm{good}} f + P^\sigma_{\mathrm{bad}} f$, where $P^\sigma_{\mathrm{good}} f := \sum_{I \in \mathcal{D} : I \text{ is } (\epsilon, r)\text{-good}} \Delta^\sigma_I f$.

Use the same notation for the weight $w$.
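Proposition 2.3. There holds

$\mathbb{E} \|P^\sigma_{\mathrm{bad}} f\|_\sigma^2 \lesssim \epsilon^{-1} 2^{-\epsilon r} \|f\|_\sigma^2$.

Proof. The location of $I$ and the property of $I$ being bad are independent, hence

$\mathbb{E} \|P^\sigma_{\mathrm{bad}} f\|_\sigma^2 = \mathbb{E} \sum_{I \in \mathcal{D}} \mathbf{1}_{I \text{ is bad}} \hat f(I)^2 = p_{\mathrm{bad}} \, \mathbb{E} \sum_{I \in \mathcal{D}} \hat f(I)^2 \le p_{\mathrm{bad}} \|f\|_\sigma^2$,

and then the proposition follows.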

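Lemma 2.4. For any $0 < \epsilon < 1$, there is a choice of $r \in \mathbb{N}$ sufficiently large so that this holds. Let
$w, \sigma$ be a pair of weights for which the constant $\mathcal{H}$ is finite, and suppose there holds uniformly over
admissible dyadic grids $\mathcal{D}$,

(2.5) $|\langle H_\sigma P^\sigma_{\mathrm{good}} f, P^w_{\mathrm{good}} g \rangle_w| \lesssim \mathcal{H} \|f\|_\sigma \|g\|_w$,

then, the best constant in (1.1) satisfies $\mathcal{N} \lesssim \mathcal{H}$.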

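Proof. Recall that the two weight norm inequality is uniform over all truncations $0 < \tau < 1$ as in
(1.2). Restricting $\tau_0 < \tau < 1$, for some fixed positive $\tau_0$, the A2 condition then implies a bound $\mathcal{N}_{\tau_0}$ on
the operators $H_\tau$. Indeed, for $x_0 \in \mathbb{R}$, let $I_0 = (x_0 - \tau_0/2, x_0 + \tau_0/2)$. Then, for constants $C_0$ that only
depend upon $\tau_0$, for all $\tau_0 < \tau < 1$,

$\int_{I_0} \left| \int_{\tau < |x - y| < \tau^{-1}} \frac{f(y)}{y - x} \, \sigma(dy) \right|^2 w(dx)$
$\le C_0 \frac{w(I_0)}{|I_0|} \int_{\tau_0/2 < |x_0 - y| < 2 \tau_0^{-1}} f(y)^2 \, \sigma(dy) \cdot \int p_{I_0}(y)^2 \, \sigma(dy)$
$\le C_0 \mathcal{A}_2 \int_{\tau_0/2 < |x_0 - y| < 2 \tau_0^{-1}} f(y)^2 \, \sigma(dy)$.

There is no need to have an effective control on the constant $C_0$. This is summed over $x_0 \in (\tau_0/2)\mathbb{Z}$,
to complete the proof of the finiteness of $\mathcal{N}_{\tau_0}$, assuming the A2 condition.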

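The argument below shows that $\mathcal{N}_{\tau_0} \lesssim \mathcal{H}$. Use Proposition 2.3 on the good and bad projections, as
written and the same version for $L^2(w)$.

$|\langle H_\sigma f, g \rangle_w| \le \mathbb{E} \{ |\langle H_\sigma P^\sigma_{\mathrm{good}} f, P^w_{\mathrm{good}} g \rangle_w| + |\langle H_\sigma P^\sigma_{\mathrm{good}} f, P^w_{\mathrm{bad}} g \rangle_w|$
$+ |\langle H_\sigma P^\sigma_{\mathrm{bad}} f, P^w_{\mathrm{good}} g \rangle_w| + |\langle H_\sigma P^\sigma_{\mathrm{bad}} f, P^w_{\mathrm{bad}} g \rangle_w| \}$.

The first term is controlled by the assumption (2.5), and the remaining terms are controlled by the
finiteness of $\mathcal{N}_{\tau_0}$ and the average-norm estimate on the bad projection. By appropriate selection of $f \in L^2(\sigma)$
and $g \in L^2(w)$, there holds an estimate of the form

$\mathcal{N}_{\tau_0} \le C \mathcal{H} + C p_{\mathrm{bad}}^{1/2} \mathcal{N}_{\tau_0}$.

For any fixed $\epsilon$, we can take $r \gtrsim \epsilon^{-1} \log \epsilon^{-1}$, so that the second term can be absorbed into the left hand
side. □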

2.4. Context and Discussion. 

2.4.1. The random grid method was pioneered in [31], and is a critical tool in non-homogeneous analysis
[54], where the weights need not be doubling. It has a broader set of uses, as witnessed by a powerful
representation of a general Calderón-Zygmund operator as a rapidly convergent sum of dyadic operators
due to Hytönen [13].

2.4.2. The parameterization of the grids used here follows Hytönen [14], but the statistics of this
parameterization are those of the random shift in [30,31].

3. Necessary Conditions 

Herein, we take up the necessity of the A2 condition from the norm inequality. Following that is the 
monotonicity property, an essential property of the Hilbert transform, and then showing the necessity of 
the energy inequality from the A2 and interval testing condition. The energy inequality is foundational 
to the proof, and is elaborated in the section on functional energy §7. 

3.1. The A2 Condition. We verify that the general A2 condition (1.13) is necessary for the two weight 
inequality (1.1), and in so doing, we can assume that a and w are compactly supported weights, which 
will ensure that different norms below are indeed finite. 

Proposition 3.1. Assume that the pair of weights do not share a common point mass, and that the
norm inequality (1.1) holds. Then, the A2 condition (1.4) holds.

Proof. Assume that $w$ and $\sigma$ are supported on a compact interval, so that various integrals below are
necessarily finite. For any interval $I$, if $\sigma$ has a point mass at $x_I$, the center of $I$, then we can estimate
as follows, relying on the fact that $w$ cannot have a point mass at $x_I$. Using the notation of (1.11),

(3.2) $\frac{\sigma(\{x_I\})^{1/2}}{|I|^{1/2}} P(w, I) \lesssim \sigma(\{x_I\})^{1/2} \, |H_w(\operatorname{sgn}(\cdot - x_I) p_I)(x_I)| \le \mathcal{N} \|p_I\|_w$.

Here, we have appealed to the assumed norm inequality to deduce the last inequality. Rearranging, we see
that the A2 inequality holds when there is a point mass for $\sigma$ at the center of $I$ that dominates
$P(\sigma, I)$. We will return to this point below.

A general inequality is derived. For $y < x$, and any interval $I$, there holds

$|I|(x - y) = |I|\{(x - x_I) - (y - x_I)\} \le (|I| + |x - x_I|) \cdot (|I| + |y - x_I|)$.

Thus,

$p_I(x) p_I(y) \lesssim \frac{1}{x - y}$, $y < x$.
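This kernel inequality can be probed by random sampling. The sketch below takes $p_I(x) = |I|^{1/2} / (|I| + \operatorname{dist}(x, I))$ and records the largest observed value of $p_I(x) p_I(y) (x - y)$ over random intervals and random points $y < x$. The comparison constant 4 is a crude bound worked out for this sketch (from $|I| + \operatorname{dist}(x, I) \ge \tfrac{1}{2}(|I| + |x - x_I|)$ and the displayed inequality), not a constant stated in the text; the sampling ranges are arbitrary.

```python
import random


def p(x, left, right):
    """p_I(x) = sqrt(|I|) / (|I| + dist(x, I)) for I = [left, right]."""
    length = right - left
    dist = max(left - x, x - right, 0.0)
    return length ** 0.5 / (length + dist)


random.seed(0)
worst = 0.0
for _ in range(100_000):
    left = random.uniform(-5, 5)
    right = left + random.uniform(0.01, 5)
    y, x = sorted(random.uniform(-20, 20) for _ in range(2))
    if x > y:
        worst = max(worst, p(x, left, right) * p(y, left, right) * (x - y))

# The largest sampled value of p_I(x) p_I(y) (x - y); it should not exceed 4.
print(worst)
```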



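For an arbitrary $a \in \mathbb{R}$, use this inequality for $x > a$ to see that

$p_I(x) \int_{(-\infty, a)} p_I(y)^2 \, \sigma(dy) \lesssim \int_{(-\infty, a)} \frac{p_I(y)}{x - y} \, \sigma(dy) = |H(\sigma p_I \mathbf{1}_{(-\infty, a)})(x)|$.

From the assumed norm inequality, there holds

$\int_{[a, \infty)} p_I(x)^2 \left[ \int_{(-\infty, a)} p_I(y)^2 \, \sigma(dy) \right]^2 dw \lesssim \int_{[a, \infty)} H(\sigma p_I \mathbf{1}_{(-\infty, a)})^2 \, dw \le \mathcal{N}^2 P(\sigma \mathbf{1}_{(-\infty, a)}, I)$.

And, rearranging, there holds

(3.3) $P(\sigma \mathbf{1}_{(-\infty, a)}, I) \cdot P(w \mathbf{1}_{[a, \infty)}, I) \lesssim \mathcal{N}^2$.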

This holds for all intervals I and a, noting that the two weights are restricted to complementary half-lines. 
Clearly, it holds with the roles of w and a reversed, and with the roles of the half-lines reversed. 

Suppose there is an $a$ that 'evenly' divides $P(\sigma, I)$, in the sense that

$P(\sigma \mathbf{1}_{(a, \infty)}, I), \; P(\sigma \mathbf{1}_{(-\infty, a)}, I) \ge \frac{1}{4} P(\sigma, I)$.

It then follows immediately from (3.3) that the A2 product is controlled by $\mathcal{N}^2$.

Thus, we can assume that there is no such point a. That means that P(a, I) is dominated by a point 
mass. For some a G R, there holds pj (a)a({a}) > |P(o-, I). We assert that there holds 

(3.4) pf (a)o-({a})P(w, I) < C^^P(w, I - a) , 

Assume that (3.4) fails. That is, for a small constant 0 < c < 1, there holds

$$\frac{\sigma(\{a\})}{|I|}\, P(w, I - a) < c\, p_I(a)^2\, \sigma(\{a\})\, P(w, I).$$

Write $a = x_I + \beta|I|$, and rearrange terms above to see that

$$P(w, I - a) < \frac{c}{(1 + |\beta|)^2}\, P(w, I).$$

But, this cannot possibly hold, since looking at the implied integrands above, $p_I(x)^2 \le C(1+\beta^2)\, p_{I-a}(x)^2$.
Thus, we have a contradiction, and (3.4) holds.

Now, σ has a point mass at a, thus, the upper bound of the right hand side of (3.4) follows from
(3.2). The proof is complete.

□

3.2. The Monotonicity Principle. Certain kinds of 'off-diagonal' estimates for the Hilbert transform
have concrete estimates in terms of the Poisson integral. The Lemma below makes this precise, and shows
moreover that we need not be that careful about exactly which function appears in the Poisson integral.
It is at the core of the entire proof.

Lemma 3.5 (Monotonicity Principle). Suppose that ν is a signed measure, and μ is a positive measure
with μ ≥ |ν|, both supported outside an interval I ∈ 𝒟. Suppose that neither μ, ν nor w has a point
mass at an endpoint of I. There holds for any $g \in L^2(J, w)$, with w-integral zero,

$$P(\mu, I)\, \Bigl\langle \frac{x}{|I|}, \bar g \Bigr\rangle_w \lesssim \langle H\mu, \bar g \rangle_w. \tag{3.6}$$

Here, $\bar g = \sum_{J'} |\hat g(J')|\, h^w_{J'}$ is a Haar multiplier applied to g. Under the stronger assumption that J ⊂ I is
good with $2^r|J| < |I|$, where r is sufficiently large, it holds that

$$|\langle H\nu, g \rangle_w| \le \langle H\mu, \bar g \rangle_w \approx P(\mu, J)\, \Bigl\langle \frac{x}{|J|}, \bar g \Bigr\rangle_w. \tag{3.7}$$

Note that the Poisson term, in the first estimate, is always estimated above by an inner product 
involving the Hilbert transform. In the second, note that the inner product can always be made larger by 
making the weight positive. Moreover, under moderate assumptions on the support of the weight, this 
inequality can be reversed. However, the conditions to get the reversal are particular, and this drives the 
case analysis in different sections of the proof. 



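Proof. By linearity, it suffices to consider the case of $g(x) = h^w_J(x)$, for J ⊂ I. Now, for the first claim,
note that

$$\langle H\mu, h^w_J \rangle_w = \int_{\mathbb{R} - I} \int_J \int_J \Bigl\{ \frac{1}{y-x} - \frac{1}{y-x'} \Bigr\}\, h^w_J(x)\, \frac{w(dx)\, w(dx')}{w(J)}\, \mu(dy)$$

$$\ge \int_{\mathbb{R} - I} \int_J \int_J \frac{|I|}{(|I| + \operatorname{dist}(y, I))^2} \cdot \frac{x - x'}{|I|}\, h^w_J(x)\, \frac{w(dx)\, w(dx')}{w(J)}\, \mu(dy)$$

$$= P(\mu, I)\, \Bigl\langle \frac{x}{|I|}, h^w_J \Bigr\rangle_w.$$

The first line follows since $h^w_J$ has w-integral zero. The next line follows by inspection. In the last line,
recall that the Haar functions are defined so that the inner product is non-negative.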

The second inequality follows from

$$|\langle H\nu, h^w_J \rangle_w| \le \langle H\mu, h^w_J \rangle_w \approx P(\mu, J)\, \Bigl\langle \frac{x}{|J|}, h^w_J \Bigr\rangle_w.$$

The function Hμ is monotonically increasing on I, and we have defined the Haar functions so that
the inner product $\langle H\mu, h^w_J \rangle_w$ is positive, as is $\langle x, h^w_J \rangle_w$. So the right hand side above is non-negative.

The Haar function $h^w_J$ is also constant on both halves of J, and has w-integral zero. Thus, there is a
monotonic increasing map $\phi : J_+ \to J_-$ so that

$$\langle H\nu, h^w_J \rangle_w = \int_{J_+} \{ H\nu(x) - H\nu(\phi(x)) \}\, h^w_J(x)\, dw.$$

The Haar function is positive on $J_+$. Under the assumption that |ν| ≤ μ, we make the difference on
the right both bigger in absolute value, and positive, by replacing Hν with Hμ. This proves the first
inequality.

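To compare to the Poisson integral, we examine the inner product $\langle H\mu, h^w_J \rangle_w$. Let us write, for x ∈ J
and y ∈ R − I, and $x_J$ the center of J,

$$\frac{1}{y-x} = \frac{1}{(y-x_J) - (x-x_J)} = \frac{1}{y-x_J} \cdot \frac{1}{1 - \frac{x-x_J}{y-x_J}}.$$

Therefore, it follows that

$$\langle H\mu, h^w_J \rangle_w = \langle H\mu - H\mu(x_J), h^w_J \rangle_w = \int_{\mathbb{R} - I} \int_J \frac{1}{y-x_J} \sum_{k=1}^{\infty} \Bigl( \frac{x-x_J}{y-x_J} \Bigr)^{k} h^w_J(x)\, dw(x)\, d\mu(y)$$

$$= \sum_{k=1}^{\infty} \int_{\mathbb{R} - I} \int_J \frac{(x-x_J)^k}{(y-x_J)^{k+1}}\, h^w_J(x)\, dw(x)\, d\mu(y).$$

Recall that the condition that J be good implies that $\operatorname{dist}(\partial I, J) \ge 2^{(1-\varepsilon)r}|J|$. The term k = 1 is

$$\int_{\mathbb{R} - I} \frac{\mu(dy)}{(y-x_J)^2} \cdot \langle x - x_J, h^w_J \rangle_w = c_r\, P(\mu, J)\, \Bigl\langle \frac{x}{|J|}, h^w_J \Bigr\rangle_w,$$

where $c_r$ has a uniform lower bound for all r sufficiently large. All of the higher order terms are
geometrically less than this. For k ≥ 2, note that we will always have

$$\int_{\mathbb{R} - I} \frac{|J|^{k}}{|y-x_J|^{k+1}}\, d\mu \le 2^{(\varepsilon-1)r(k-1)} \int_{\mathbb{R} - I} \frac{|J|}{(y-x_J)^2}\, d\mu.$$

And critically, by examination

$$\Bigl| \frac{(x-x_J)^k}{|J|^k}\, h^w_J(x) \Bigr| \le 2^{-k+1}\, \frac{x-x_J}{|J|}\, h^w_J(x), \qquad x \in J,\ k \ge 2.$$

Note that the function on the right is non-negative, and integrates to $\langle x/|J|, h^w_J \rangle_w$. For r sufficiently large,
this will complete the proof. □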
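
The geometric series behind the comparison with the Poisson integral can be checked directly. The sketch below evaluates partial sums of $\sum_{k \ge 0} (x - x_J)^k/(y - x_J)^{k+1}$ and compares them with 1/(y − x); the function name and the sample points are hypothetical choices.

```python
def kernel_series(x, y, center, terms):
    """Partial sum sum_{k=0}^{terms-1} (x - x_J)^k / (y - x_J)^{k+1}.

    For |x - x_J| < |y - x_J| this converges to 1 / (y - x).
    """
    ratio = (x - center) / (y - center)
    return sum(ratio ** k for k in range(terms)) / (y - center)


# x inside J = [-0.5, 0.5) with center 0, and y far outside:
exact = 1.0 / (5.0 - 0.3)
for terms in (1, 2, 3, 5):
    print(terms, abs(kernel_series(0.3, 5.0, 0.0, terms) - exact))
```

The k = 0 term is constant in x and is annihilated by the mean-zero Haar function, which is why the expansion in the proof starts at k = 1; the error after the first-order term already shrinks like the square of the ratio |x − x_J|/|y − x_J|.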

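3.3. The Energy Inequality. The energy inequality is phrased in terms of the quantity

$$E(w, I)^2 := |I|^{-2}\, \mathbb{E}^w_I \bigl| x - \mathbb{E}^w_I x \bigr|^2 = \frac{1}{|I|^2\, w(I)} \sum_{J : J \subset I} \langle x, h^w_J \rangle_w^2.$$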
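
For a purely atomic weight the energy is just a normalized variance, which a short sketch makes concrete. It uses the first expression in the definition, $E(w,I)^2 = |I|^{-2}\, \mathbb{E}^w_I |x - \mathbb{E}^w_I x|^2$; the function name and the half-open convention for I are assumptions made for illustration.

```python
def energy_squared(points, masses, left, right):
    """E(w, I)^2 for w = sum_i masses[i] * delta_{points[i]} and I = [left, right).

    Computed as the w-variance of x on I, divided by |I|^2.
    """
    inside = [(x, m) for x, m in zip(points, masses) if left <= x < right]
    total = sum(m for _, m in inside)
    if total == 0:
        return 0.0
    mean = sum(x * m for x, m in inside) / total
    variance = sum(m * (x - mean) ** 2 for x, m in inside) / total
    return variance / (right - left) ** 2


print(energy_squared([0.3], [2.0], 0.0, 1.0))            # one atom: no energy
print(energy_squared([0.5, 1.5], [1.0, 1.0], 0.0, 2.0))  # two well-separated atoms
```

The energy vanishes exactly when w restricted to I is a single point mass, and it never exceeds 1/4, since the variance of a quantity confined to I is at most |I|^2/4.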

Lemma 3.8. [The Energy Inequality] For any interval $I_0$ and any partition 𝒫 of $I_0$ into intervals such
that neither σ nor w have point masses at the endpoints, there holds

$$\sum_{I \in \mathcal{P}} P(\sigma \cdot I_0, I)^2\, E(w, I)^2\, w(I) \le C_0 \mathcal{H}^2 \sigma(I_0). \tag{3.9}$$

Here, $C_0$ is an absolute constant.

Proof. It follows from (3.6), and simple duality considerations, and linearity of $H_\sigma$ that

$$P(\sigma(I_0 - I), I)^2\, E(w, I)^2\, w(I) \lesssim \|H_\sigma(I_0 - I)\|^2_{L^2(I, w)} \lesssim \|H_\sigma I_0\|^2_{L^2(I, w)} + \|H_\sigma I\|^2_{L^2(I, w)}.$$

These two terms are controlled by interval testing.
Thus, it follows that we have this modified version of the energy inequality, in which the support of σ
has 'holes.'

$$\sum_{I \in \mathcal{P}} P(\sigma(I_0 - I), I)^2\, E(w, I)^2\, w(I) \le C_0 \mathcal{T}^2 \sigma(I_0).$$

To get the Lemma as written, use the A2 condition to 'fill in the holes.'

3.4. Context and Discussion. 

3.4.1. The necessity of the A2 condition was easily available, with an argument of Sergei Treil already
pointed out by Sarason in his note [45]. This argument, based upon complex variables, has close analogs
in [33,54]. The real variable proof presented herein is in [21]. The early paper of Muckenhoupt and
Wheeden [28] contains a proof of the necessity of the half-Poisson condition,

$$\sup_I\ \frac{\sigma(I)}{|I|}\, P(w, I) + \frac{w(I)}{|I|}\, P(\sigma, I) \lesssim \mathcal{N}^2.$$

(The half-Poisson condition suffices for almost the entire proof; the full Poisson condition is needed to
deduce the functional energy inequality.) Higher dimensional extensions, which are not straightforward,
are discussed in [20].

3.4.2. The argument of §3.1 can be modified to prove that for general pairs of weights w and σ, that
is, pairs that may share common point masses, if the norm inequality (1.1) holds, then the two Poisson
A2 conditions (1.13) and (1.14) hold.

3.4.3. The energy inequality was influenced by the following assumption placed upon the pair of weights
in [33,54]. Assume that there is a finite constant $\mathcal{V}$ so that for all intervals $I_0$, and all partitions 𝒫 of $I_0$,

$$\sum_{I \in \mathcal{P}} P(\sigma \cdot I_0, I)^2\, w(I) \le \mathcal{V}^2 \sigma(I_0). \tag{3.10}$$

Also assume that the dual inequality holds. In the language of Nazarov-Treil-Volberg, this is the pivotal
condition. They proved

Theorem C. Assume that w and σ do not share a common point mass. Then, there holds
$\mathcal{N} \lesssim A_2^{1/2} + \mathcal{T} + \mathcal{V}$.

This is a very strong Theorem, with an important proof. It decisively used the tools of non-homogeneous
harmonic analysis, namely random grids and good-bad projections. The pivotal condition
controlled certain degeneracies in the pair of weights, compare to Definition 4.5. To illustrate the
difficulties in the general case, we prove this theorem in §10.

The pivotal condition holds if the pair of maximal function estimates hold, namely $M_\sigma : L^2(\sigma) \to L^2(w)$
and $M_w : L^2(w) \to L^2(\sigma)$. And, so it offered a complete characterization of the two weight
inequality for the triple of operators $(H_\sigma, M_\sigma, M_w)$. If the pair of weights are doubling, then the
boundedness of the maximal functions is a consequence of the A2 condition.¹ The full characterization
of the boundedness of the Hilbert transform was known for doubling measures. See [54].

The pivotal condition is generic in the following sense. Assuming the pivotal condition, the Hilbert 
transform can be replaced by a generic Calderon-Zygmund operator with one derivative on its kernel. 
This, and its extension to operators with a rougher kernel, was fundamental in the paper [40], whose 
main result was an important intermediate one in the solution of the A2 conjecture [13]. 

3.5. The functional energy inequality is also generic, in the following sense, which we describe a little
imprecisely. Given a pair of weights σ, w let $\mathcal{E}$ be the best constant in the inequality (7.3), as written,
and in its dual form. Let $Tf(x) = \int K(x,y) f(y)\, dy$ be an operator whose kernel K(x, y) satisfies the size and
gradient condition

$$|x-y| \cdot |\nabla K(x,y)| + |K(x,y)| \lesssim |x-y|^{-1}.$$

Let $\mathcal{N}_T$ be the norm of $T_\sigma$ from $L^2(\sigma)$ to $L^2(w)$, and let $\mathcal{T}$ be the best constant in the testing inequalities

$$\int_I |T_\sigma \mathbf{1}_I|^2\, dw \le \mathcal{T}^2 \sigma(I), \quad \text{and} \quad \int_I |T_w \mathbf{1}_I|^2\, d\sigma \le \mathcal{T}^2 w(I).$$

Then, there holds $\mathcal{N}_T \lesssim A_2^{1/2} + \mathcal{E} + \mathcal{T}$. This is an element of the characterization of compactness of the
Hilbert transform. The paper [49] formalizes this, for a class of fractional Calderon-Zygmund operators.
Observe that T need not be bounded on $L^2(dx)$, and that requiring one derivative on the kernel is
important. If fewer derivatives are required, then the Poisson integral needs to be changed to reflect the
rougher kernel.

3.5.1. Nazarov-Treil-Volberg, in language reminiscent of Sarason, wrote that 'perhaps the pivotal con- 
dition is necessary' for the boundedness of the Hilbert transform. This turned out to have a strong 
measure of truth, in that using the specific structure of the Hilbert transform, the energy inequality was 
shown necessary in [21]. Note that one can formally obtain the pivotal condition (3.10) from the energy 
inequality (3.9) by raising the energy term E(w, I) to the zero power, rather than the necessary power 2. 
The paper [21] then adapted the approach of [33,54], essentially imposing a new weaker condition on 
the pair of weights in which one raised the energy to a power intermediate between 0 and 2. In addition,
that paper provided an explicit example, recounted in §11, that showed that the pivotal condition (3.10) 
is not necessary for the boundedness of the Hilbert transform. 



¹Alternatively, under the assumption of w being doubling, check that the energy satisfies $E(w, I) \gtrsim 1$, with the implied
constant depending upon the doubling constant. Thus, the necessary energy inequality implies the pivotal condition.

3.5.2. The energy inequality will not follow from just the A2 condition. Given interval $I_0$, and partition
𝒫 of $I_0$, one can write

$$\sum_{I \in \mathcal{P}} P(\sigma, I)^2\, E(w, I)^2\, w(I) \le A_2 \sum_{I \in \mathcal{P}} |I| \cdot P(\sigma, I) = A_2 \int \sum_{I \in \mathcal{P}} \frac{|I|^2}{(|I| + \operatorname{dist}(x, I))^2}\, \sigma(dx).$$

To finish, one would have to know that the function inside the integral is bounded. But, this is not true
in general. Though it is a very tame BMO function, this fact does not help, since σ is a general measure,
and need not satisfy any $A_\infty$ type condition. Indeed, the entire proof is more or less classical if the
weights satisfy a mutual $A_\infty$ type condition.
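
That the function inside the integral can be unbounded is easy to see on an example. The sketch below evaluates $\sum_{I} |I|^2/(|I| + \operatorname{dist}(x, I))^2$ at x = 0 for a partition of [0, 1) into dyadic intervals shrinking towards 0; the partition and the function names are hypothetical choices made only for illustration.

```python
def overlap_function(x, intervals):
    """Evaluate sum_I |I|^2 / (|I| + dist(x, I))^2 over (left, right) pairs."""
    total = 0.0
    for left, right in intervals:
        length = right - left
        dist = max(left - x, x - right, 0.0)
        total += length ** 2 / (length + dist) ** 2
    return total


def shrinking_partition(depth):
    """Partition [0, 1) into [0, 2^-depth) and [2^-(k+1), 2^-k), k < depth."""
    pieces = [(0.0, 2.0 ** -depth)]
    pieces += [(2.0 ** -(k + 1), 2.0 ** -k) for k in range(depth)]
    return pieces


for depth in (4, 8, 16, 32):
    print(depth, overlap_function(0.0, shrinking_partition(depth)))
```

Each interval $[2^{-(k+1)}, 2^{-k})$ sits at distance equal to its own length from 0 and so contributes exactly 1/4, giving the value 1 + depth/4. The growth is logarithmic in the smallest scale, in line with the remark that the function is a tame BMO function and yet unbounded.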

3.5.3. The monotonicity principle, Lemma 3.5, was noted in [22]. It, with the energy inequality, are 
essential aspects of the proof. 

4. The Global to Local Reduction
Our aim is to prove the estimate (2.5),

$$|\langle H_\sigma P^\sigma_{\mathrm{good}} f, P^w_{\mathrm{good}} g \rangle_w| \lesssim \mathcal{H}\, \|f\|_\sigma \|g\|_w.$$

That is, the bilinear form only needs to be controlled for (ε, r)-good functions $f = P^\sigma_{\mathrm{good}} f$ and similarly
for g, goodness being defined with respect to a fixed dyadic grid. Suppressing the notation, we write
'good' for '(ε, r)-good,' and it is always assumed that the dyadic grid 𝒟 is fixed, and only good intervals
are in the Haar support of f and g. We clearly remark on goodness when the property is used; any 
value of < £ < I is sufficient for our purposes. The symbol e is kept throughout, as a guide to the 
appearance of the good property of intervals. 

The inequality above is reduced to the local estimate, (4.16), at the end of this section. It is sufficient 
to assume that f and g are supported on an interval Iq; by trivial use of the interval testing condition, 
we can further assume that f and g are of integral zero in their respective spaces. Thus, f is in the linear
span of (good) Haar functions $h^\sigma_I$ for $I \subset I_0$, and similarly for g, and

$$\langle H_\sigma f, g \rangle_w = \sum_{I, J : I, J \subset I_0} \langle H_\sigma \Delta^\sigma_I f, \Delta^w_J g \rangle_w. \tag{4.1}$$

The double sum is broken into different summands. Many of the resulting cases are elementary, and
we summarize these estimates as follows. Define the triangular bilinear form

$$B^{\mathrm{above}}(f, g) := \sum_{I : I \subset I_0}\ \sum_{J : J \Subset I} \mathbb{E}^\sigma_{I_J} \Delta^\sigma_I f \cdot \langle H_\sigma I_J, \Delta^w_J g \rangle_w, \tag{4.2}$$

where here and throughout, $J \Subset I$ means J ⊂ I and $2^{r+1}|J| < |I|$. In words, J is strongly contained in I.
In addition, the argument of the Hilbert transform, $I_J$, is the child of I that contains J, so that $\Delta^\sigma_I f$ is
constant on $I_J$. Define $B^{\mathrm{below}}(f, g)$ in the dual fashion. The next Lemma is proved in §5.
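
The index set of the triangular form is easy to make concrete on a dyadic grid. The sketch below encodes a dyadic interval $[j 2^{-k}, (j+1) 2^{-k})$ as the pair (k, j), and implements the relation 'J strongly contained in I' with the strict inequality $2^{r+1}|J| < |I|$ read off from the text, together with the child $I_J$. The encoding and all function names are hypothetical.

```python
# A dyadic interval [j * 2^-k, (j + 1) * 2^-k) is encoded as the pair (k, j).

def contains(outer, inner):
    """True if the dyadic interval `inner` is a subset of `outer`."""
    (ko, jo), (ki, ji) = outer, inner
    return ki >= ko and (ji >> (ki - ko)) == jo


def strongly_contained(inner, outer, r):
    """J strongly contained in I: J is a subset of I and 2^(r+1) |J| < |I|."""
    (ko, _), (ki, _) = outer, inner
    # |J| = 2^-ki and |I| = 2^-ko, so the length condition reads ki - ko > r + 1.
    return contains(outer, inner) and ki - ko > r + 1


def child_containing(outer, inner):
    """The child I_J of I = outer that contains J = inner (J a proper subinterval)."""
    (ko, _), (ki, ji) = outer, inner
    assert contains(outer, inner) and ki > ko
    return (ko + 1, ji >> (ki - ko - 1))


I, J = (0, 0), (3, 5)  # I = [0, 1), J = [5/8, 6/8)
print(strongly_contained(J, I, r=1), child_containing(I, J))
```

With this encoding the pairs (I, J) appearing in (4.2) are exactly those for which `strongly_contained(J, I, r)` holds, and `child_containing(I, J)` gives the argument of the Hilbert transform.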

Lemma 4.3. There holds

$$|\langle H_\sigma f, g \rangle_w - B^{\mathrm{above}}(f, g) - B^{\mathrm{below}}(f, g)| \lesssim \mathcal{H}\, \|f\|_\sigma \|g\|_w.$$

Thus, the main technical result is as below; it immediately supplies our main theorem.

Theorem 4.4. There holds

$$|B^{\mathrm{above}}(f, g)| \lesssim \mathcal{H}\, \|f\|_\sigma \|g\|_w.$$

The same inequality holds for the dual form $B^{\mathrm{below}}(f, g)$.

The remainder of this section is devoted to a reduction of the global Theorem 4.4 to a local estimate
described in Theorem 4.10. In the local estimate, the function f is more structured in that it has bounded
averages on a fixed interval, and the pair of functions f, g are more structured in that their Haar supports
avoid intervals that strongly violate the energy inequality, in the following sense.

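Definition 4.5. Given any interval $I_0$, define $\mathcal{F}_{\mathrm{energy}}(I_0)$ to be the maximal subintervals $I \subset I_0$ such that

$$P(\sigma \cdot I_0, I)^2\, E(w, I)^2\, w(I) > 10\, C_0 \mathcal{H}^2 \sigma(I).$$

Here, $C_0$ is the constant in (3.9). There holds $\sigma(\cup\{F : F \in \mathcal{F}_{\mathrm{energy}}(I_0)\}) \le \tfrac{1}{10}\sigma(I_0)$, by the energy inequality.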

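We make the following construction for an $f \in L^2_0(I_0, \sigma)$, the subspace of $L^2(I_0, \sigma)$ of functions of
mean zero. Add $I_0$ to $\mathcal{F}$, and set $\alpha_f(I_0) := \mathbb{E}^\sigma_{I_0}|f|$. In the inductive stage, if $F \in \mathcal{F}$ is minimal, add to $\mathcal{F}$
those maximal descendants F' of F such that either (a) $\mathbb{E}^\sigma_{F'}|f| > 10\alpha_f(F)$, or (b) $F' \in \mathcal{F}_{\mathrm{energy}}(F)$. Then
define

$$\alpha_f(F') := \begin{cases} \alpha_f(F) & \mathbb{E}^\sigma_{F'}|f| \le 10\alpha_f(F) \\ \mathbb{E}^\sigma_{F'}|f| & \text{otherwise} \end{cases}$$

If there are no such intervals F', the construction stops. We refer to $\mathcal{F}$ and $\alpha_f(\cdot)$ as Calderon-Zygmund
stopping data for f. Their key properties are collected here.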
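
The stopping construction can be sketched on a finite dyadic grid. The code below is a simplified illustration, not the construction of the text: it implements only rule (a), the large-average rule, and omits the energy rule (b) entirely. The discretization (a weight given by its masses on the finest dyadic cells of [0, 1)), the encoding of intervals as (level, index) pairs, and the function name are all assumptions.

```python
def stopping_data(f_abs, sigma, threshold=10.0):
    """Stopping intervals and stopping values for |f| on a finite dyadic grid.

    f_abs[c] and sigma[c] are the value of |f| and the sigma-mass on the c-th
    finest dyadic cell of I_0 = [0, 1); len(sigma) must be a power of two.
    A dyadic interval is encoded as (level, index).  Only rule (a) of the
    construction is implemented; the energy rule (b) is omitted.
    """
    n = len(sigma)
    depth = n.bit_length() - 1
    assert n == 1 << depth and len(f_abs) == n

    def average(level, index):
        width = n >> level
        cells = range(index * width, (index + 1) * width)
        mass = sum(sigma[c] for c in cells)
        return sum(f_abs[c] * sigma[c] for c in cells) / mass if mass > 0 else 0.0

    alpha = {(0, 0): average(0, 0)}  # I_0 is always a stopping interval

    def descend(level, index, parent_alpha):
        if level == depth:
            return
        for child in (2 * index, 2 * index + 1):
            avg = average(level + 1, child)
            if avg > threshold * parent_alpha:
                # Maximal descendant with a large average: a new stopping interval.
                alpha[(level + 1, child)] = avg
                descend(level + 1, child, avg)
            else:
                descend(level + 1, child, parent_alpha)

    descend(0, 0, alpha[(0, 0)])
    return alpha


data = stopping_data([0.0] * 15 + [1600.0], [1.0] * 16)
print(data)
```

In this example all of |f| sits on the last of sixteen cells; the averages over the intervals containing that cell are 100, 200, 400, 800 and 1600, so only the cell itself exceeds ten times the stopping value 100 of the top interval. Since the children selected under rule (a) have averages more than ten times larger than their parent's, their total σ-measure is less than a tenth of the parent's, which is the mechanism behind the Carleson property recorded next.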

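Lemma 4.6. For $\mathcal{F}$ and $\alpha_f(\cdot)$ as defined above, there holds

(1) $I_0$ is the maximal element of $\mathcal{F}$.

(2) For all I ∈ 𝒟, $I \subset I_0$, we have $|\mathbb{E}^\sigma_I f| \le 10\alpha_f(\pi_{\mathcal{F}} I)$.

(3) $\alpha_f$ is monotonic: If $F, F' \in \mathcal{F}$ and F ⊂ F' then $\alpha_f(F) \ge \alpha_f(F')$.

(4) The collection $\mathcal{F}$ is σ-Carleson in that

$$\sum_{F \in \mathcal{F} : F \subset S} \sigma(F) \lesssim \sigma(S), \qquad S \in \mathcal{D}. \tag{4.7}$$

(5) We have the inequality

$$\Bigl\| \sum_{F \in \mathcal{F}} \alpha_f(F) \cdot F \Bigr\|_\sigma \lesssim \|f\|_\sigma. \tag{4.8}$$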

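Proof. The first three properties are immediate from the construction. The fourth, the σ-Carleson
property, is seen this way. It suffices to check the property for $S \in \mathcal{F}$. Now, the $\mathcal{F}$-children can be in
$\mathcal{F}_{\mathrm{energy}}(S)$, which satisfy

$$\sum_{F' \in \mathcal{F}_{\mathrm{energy}}(S)} \sigma(F') \le \tfrac{1}{10}\sigma(S).$$

Or, they satisfy $\mathbb{E}^\sigma_{F'}|f| > 10\, \mathbb{E}^\sigma_S|f|$, but these intervals satisfy the same estimate. Hence, (4.7) holds.

For the final property, let $\mathcal{G} \subset \mathcal{F}$ be the subset at which the stopping values change: If $F \in \mathcal{F} - \mathcal{G}$,
and G is the $\mathcal{G}$-parent of F, then $\alpha_f(F) = \alpha_f(G)$. Set

$$\Phi_G := \sum_{F \in \mathcal{F} : \pi_{\mathcal{G}} F = G} F.$$

Define $G_k := \{\Phi_G \ge 2^k\}$, for k = 0, 1, .... The σ-Carleson property implies integrability of all orders in
σ-measure of $\Phi_G$. Using the third moment, we have $\sigma(G_k) \lesssim 2^{-3k}\sigma(G)$. Namely, expanding the integral
and using the Carleson measure property of $\mathcal{F}$,

$$\int \Phi_G^3\, \sigma(dx) = \sum_{\substack{F_1, F_2, F_3 \in \mathcal{F} \\ \pi_{\mathcal{G}} F_j = G,\ j = 1,2,3}} \sigma(F_1 \cap F_2 \cap F_3) \le 6 \sum_{\substack{F_1, F_2, F_3 \in \mathcal{F} \\ F_3 \subset F_2 \subset F_1 \subset G}} \sigma(F_3)$$

$$\lesssim \sum_{\substack{F_1, F_2 \in \mathcal{F} \\ F_2 \subset F_1 \subset G}} \sigma(F_2) \lesssim \sum_{F_1 \in \mathcal{F} : F_1 \subset G} \sigma(F_1) \lesssim \sigma(G).$$

It follows that $\sigma(G_k) \lesssim 2^{-3k}\sigma(G)$.
Then, estimate

$$\Bigl\| \sum_{F \in \mathcal{F}} \alpha_f(F) \cdot F \Bigr\|_\sigma^2 = \Bigl\| \sum_{G \in \mathcal{G}} \alpha_f(G)\, \Phi_G \Bigr\|_\sigma^2$$

$$\le \Bigl\| \sum_{k=0}^{\infty} (k+1)^{+1-1} \sum_{G \in \mathcal{G}} \alpha_f(G)\, 2^{k+1} \mathbf{1}_{G_k} \Bigr\|_\sigma^2$$

$$\overset{*}{\lesssim} \sum_{k=0}^{\infty} (k+1)^2\, \Bigl\| \sum_{G \in \mathcal{G}} \alpha_f(G)\, 2^{k} \mathbf{1}_{G_k}(x) \Bigr\|_\sigma^2$$

$$\overset{**}{\lesssim} \sum_{k=0}^{\infty} (k+1)^2 \sum_{G \in \mathcal{G}} \alpha_f(G)^2\, 2^{2k} \sigma(G_k)$$

$$\lesssim \sum_{G \in \mathcal{G}} \alpha_f(G)^2 \sigma(G) \lesssim \|Mf\|_\sigma^2 \lesssim \|f\|_\sigma^2.$$

Note that we have used Cauchy-Schwarz in k at the step marked by an *. In the step marked with **,
for each point x, the non-zero summands are a (super)-geometric sequence of scalars, so the square can
be moved inside the sum. Finally, we use the estimate on the σ-measure of $G_k$, and compare to the
maximal function Mf to complete the estimate.

□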

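We will use the notation

$$P^\sigma_F f := \sum_{I \in \mathcal{D} : \pi_{\mathcal{F}} I = F} \Delta^\sigma_I f,$$

and similarly for $Q^w_F g$. (Note that both are projections, but $P^\sigma_F f$ is a structured function, while $Q^w_F g$ is
not.) The inequality (4.8) allows us to estimate

$$\sum_{F \in \mathcal{F}} \{\alpha_f(F)\sigma(F)^{1/2} + \|P^\sigma_F f\|_\sigma\}\, \|Q^w_F g\|_w \lesssim \Bigl[ \sum_{F \in \mathcal{F}} \{\alpha_f(F)^2\sigma(F) + \|P^\sigma_F f\|_\sigma^2\} \times \sum_{F \in \mathcal{F}} \|Q^w_F g\|_w^2 \Bigr]^{1/2} \lesssim \|f\|_\sigma \|g\|_w. \tag{4.9}$$

We will refer to this as the quasi-orthogonality argument. It is very useful.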

The Theorem below is the essence of the reduction from a global to local estimate in our proof. 

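Theorem 4.10. [Global to Local Reduction] There holds

$$|B^{\mathrm{above}}(f, g) - B^{\mathrm{above}}_{\mathcal{F}}(f, g)| \lesssim \mathcal{H}\, \|f\|_\sigma \|g\|_w,$$

where $B^{\mathrm{above}}_{\mathcal{F}}(f, g) := \sum_{F \in \mathcal{F}} B^{\mathrm{above}}(P^\sigma_F f, Q^w_F g)$.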

Proof. We have 

B^b°^^(f,g) = ^ Y_ B^'^°^"(P^,f,Q^g]. 

The form B^°^^(f, g) is the case of F = F' in the double sum above, hence we should bound 
^B^b°^^((DFf,Qrg), where O, := Y. ^pf- 

The argument will be different based upon the distinction between J C F and J ^ F. Write Qji = 
gj + gf where 

g^ :- Y_ Aj^g 

J:7t^J=F, J^F 

We argue the case of gp first. The functions {g^ : F G J^} are 7"e-adapted, in the sense of Definition 7.1, 
and the basic tool is the functional energy inequality (7.3). 

In order to appeal to the monotonicity principle, make a 'hole' in the argument of the Hilbert transform. 
The argument of the Hilbert transform is If, the child of I that contains F. Write If = F + (Ip — F), 
and use linearity of Hq-. Note that by the standard martingale difference identity and the construction 
of stopping data, 

I ^ E^^Aj^f] < (Xf(F) , FE^. 

I:I2F 

Hence, the first term is 



gF)w| 



(4.11) 1^ Y. E^pAj^f • (H^F,g^)^| < ^af(F)|(H^F, 

feT I : I3F FeJP 

<:K^af(F)a(F)i/2||g^||^. 
Fe^ 

This just uses interval testing. Quasi-orthogonality (4.9) bounds this last expression.

For the second expression, when the argument of the Hilbert transform is $I_F - F$, first note that

I : I3F F'eJ" 

Therefore, by the definition of 7g-adapted, the monotonicity property (3.7) applies, and yields 
(4.12) |^Ef^A5^f.(H,(lF-F),g^)^|< ^ P((Da, J)(^, Jg^) , F G ^. 

Here, of course, it is important that the intervals in the Haar support of gp are strongly contained in 
F, and J'*[V) are the maximal good intervals J ^ F with TtjrJ = F. The sum over F G J^ of this last 
expression is controlled by functional energy, and the quasi-orthogonality property that ||0||(j < ||f||(T. 

We return to the functions gp defined at the beginning of the proof; the Haar supports of these 
functions are 'close to F'. Define functions 



9'f---- 



Y_ gl', 



s G 



F' :7t5^F'=F 



Here, the sum is over F' which are s steps below F in the T tree. It is straight forward to verify that the 
functions {g[^^} are also 7"g-adapted. Hence, by the argument for gp, there holds 

1/2 



1^ YElAff.{H,l,,g; 

FeJP I : I3F 



T+l\ 



<^l|f|l 






It remains to establish the estimate below, uniform over 1 < s < r and F G J^. 

I Y_ I_ Ef^,Arf-(HJp,,gf,) 
F'e.F I:F'gICF 



(4.13) 



7tl^F'=F 



<?{{cXf(F)cT(F)V2 + ||p-f||j[ Y_ llgs, 



1/2 



F'eJ- 



Here, there is a sum is over the /"-children of F. Quasi-orthogonality completes the proof. A mixture of 
elementary orthogonality considerations, and the energy inequality are needed. 
There is an elementary subcase. From (6.8), there holds 



sup |(HJJ)^|<CA:KJa(I)w(I). 

IJ : inj^o ^ 

2-^1JI<|I|<2^1JI 

By inspection |Ef^Aff| < |f(I)|a(Ij)^/^. It follows that the estimate (4.13) holds for the sum below, 
where the relative lengths of I and J are controlled. 

I Z L L Er^,Arf-(HJp,,Afgf,) 



F'eJ- I:F'QICF J:JcF' 
|I|<2^'1JI 



7tl^F'=F 



<:k Y_ L L i^wgm 



F'eJf I:F'gIcF J:JgF' 



n],f'=f 
1/2



< TfllP'^fll \ ^ lln^ 11^ 



F'GJ- 
7T!pF'=F 



It remains to consider the complementary sum. For each F' as in the sum, we write the argument of 
the Hilbert transform as Ip/ = F' + (Ip/ — F'). Interval testing shows that 



F'eJ^ I:F'QICF J:JQF' 

|I|>22''|] 



7t^F'=F 



Z L Z E?,Arf.(H,F',Afgf,) 

F'GJ- 
7rl^F'=F 

<:Kaf(F)a(F)i/2[ }^ ||gf,| 



1/2 



4f'=f 



For the second term, where the argument of the Hilbert transform is Ip/ — F', use the monotonicity 
property to see that 



L L L E?,,A?f-(H,{lF'-F'),Afgf, 



F'eJ" I:F'QICF J:JQF' 



7tl^F'=F 



<af(F) Y_ Y_ P(cTF,J)E(w,J)||Pfgf,||^, 



F'eJ- Je:7(F') 

7T!pF'=F 

where i7(F') is the maximal intervals J in the Haar support of gp' so that there is an I D F with 
2^^\]\ < |I|. Then, Pj^ = Hj/t/cjAw. Cauchy-Schwarz and the energy inequality (3.9) concludes this 
estimate. The proof is complete. 

D 

It remains to control B3^°^^(f, g). Keeping the quasi-orthogonality argument in mind, appropriate 
control on the individual summands is enough to control it. To describe what has been done, one must 
note that the functions Ppf need not be bounded. But, we are only concerned with averages over 
intervals where the average will be bounded. In addition this function and and Q^g are well-adapted to 
the pair of weights w, o". This is formalized in the next definition. 

Definition 4.14. Let $I_0$ be an interval, and let $\mathcal{S}$ be a collection of disjoint intervals contained in $I_0$. A
function $f \in L^2_0(I_0, \sigma)$ is said to be uniform (w.r.t. $\mathcal{S}$) if these conditions are met:

(1) Each energy stopping interval $F \in \mathcal{F}_{\mathrm{energy}}(I_0)$ is contained in some $S \in \mathcal{S}$.

(2) The function f is constant on each interval $S \in \mathcal{S}$.

(3) For any interval I which is not contained in any $S \in \mathcal{S}$, $|\mathbb{E}^\sigma_I f| \le 1$.

We will say that g is adapted to a function f uniform w.r.t. $\mathcal{S}$, if g is constant on each interval $S \in \mathcal{S}$.
We will also say that g is adapted to $\mathcal{S}$.

In this next Lemma, the hypothesis is the local estimate, and the conclusion is the global estimate
of Theorem 4.4. Note that it is homogeneous in g, but not f, since the term $\sigma(I_0)^{1/2}$ on the right is
motivated by the bounded averages property of f.

Lemma 4.15. [The Local Estimate] Assume that

$$|B^{\mathrm{above}}(f, g)| \lesssim \mathcal{H}\, \{\sigma(I_0)^{1/2} + \|f\|_\sigma\}\, \|g\|_w, \tag{4.16}$$

where f, g are of mean zero on their respective spaces, supported on an interval $I_0$. Moreover, f is
uniform, and g is adapted to f. Then, there holds, for all f and g,

$$|B^{\mathrm{above}}(f, g)| \lesssim \mathcal{H}\, \|f\|_\sigma \|g\|_w.$$

Proof. Let $\mathcal{F}$ and $\alpha_f(\cdot)$ be standard Calderon-Zygmund stopping data for f. By Theorem 4.10, it suffices
to bound

$$B^{\mathrm{above}}_{\mathcal{F}}(f, g) = \sum_{F \in \mathcal{F}} B^{\mathrm{above}}(P^\sigma_F f, Q^w_F g).$$

Observe that for an absolute constant C, the function

$$(C\alpha_f(F))^{-1} P^\sigma_F f \tag{4.17}$$

is uniform on F w.r.t. $\mathcal{S}_F$, the $\mathcal{F}$-children of F. Moreover, the function $Q^w_F g$ does not have any interval
J in its Haar support contained in an interval $S \in \mathcal{S}_F$. That is, it is adapted to the function in (4.17).
Therefore, by assumption,

$$|B^{\mathrm{above}}(P^\sigma_F f, Q^w_F g)| \lesssim \mathcal{H}\, \{\alpha_f(F)\sigma(F)^{1/2} + \|P^\sigma_F f\|_\sigma\}\, \|Q^w_F g\|_w.$$

The sum over $F \in \mathcal{F}$ of the right hand side is bounded by the quasi-orthogonality argument of (4.9). □

Thus, it remains to show that the local estimate (4.16) holds. The following reduction is a routine 
appeal to the testing condition. Focusing on the argument of the Hilbert transform in (4.16), we write 
Ij = lo — (lo — Ij). When the interval is Iq, and J is in the Haar support of g, notice that the scalar 

I:J«IClo 

is bounded by one. Say that f is uniform w.r.t.5, and let I^ be the minimal interval in the Haar support 
of f with J <E I. Since g is adapted to f, we cannot have Ij contained in an interval of S, and so 
|Ef f| < 1 . By the telescoping identity for martingale differences, 

£j= Y. Ef-Aff = E^,f, 
i:i-<;icio 

which is at most one in absolute value. 
Therefore, we can write 

I Y. ^EfA^f.(H,Io,Afg) 

I:ICloJ:JeI 



(4.18) 



J : J<5lo 



J:JgIo 



<Ta(Io)'/^||g 
This uses only interval testing and orthogonality of the martingale differences, and it matches the first
half of the right hand side of (4.16).

When the argument of the Hilbert transform is $I_0 - I_J$, this is the stopping form, the last component
of the local part of the problem.

4.1. Context and Discussion. 

4.1.1. The use of the energy stopping intervals, Definition 4.5, is motivated by the use of the corresponding
intervals, under the pivotal condition (3.10), in [33,54]. However, the pivotal condition is not
necessary for the two weight inequality, while the energy inequality is a consequence of the A2 and interval
testing conditions.

4.1.2. Initial arguments had largely ignored the structure of the pair of functions f, g in the inner product
$\langle H_\sigma f, g \rangle_w$, instead concentrating on proving an intricate series of Carleson measure type estimates. This
changed with the argument of [22], which introduced Calderon-Zygmund stopping intervals, and the 
quasi-orthogonality argument into the subject. It was only then that the functional energy inequality 
was identified, but not proved, in [22]. One should suspect that the structure of the functions is highly 
relevant to the problem, and this is expressed through the Calderon-Zygmund stopping data. A second 
source of strength from this perspective comes from the absence of canonical paraproducts. Attempts 
to introduce them induce ad hoc elements into the proof. 

4.1.3. The use of the functional energy inequality in this context follows the argument of [23], where 
this inequality was proved. 

4.1.4. This section begins with the elementary and familiar Lemma 4.3, and then argues that the control
of the triangular form $B^{\mathrm{above}}(f, g)$ splits into the 'global to local' and the 'local' part. The authors of
[23] only had the first reduction. And, using the techniques of that paper, could prove

Theorem D. [23] There holds $|B^{\mathrm{above}}(f, g)| \lesssim \{\mathcal{H} + \mathcal{B}_\infty\}\, \|f\|_\sigma \|g\|_w$, where $\mathcal{H} = A_2^{1/2} + \mathcal{T}$, and the
remaining constant is the best constant in

$$|B^{\mathrm{above}}(f, g)| \le \mathcal{B}_\infty\, \sigma(I_0)^{1/2}\, \|g\|_w,$$

where $|f| \le \mathbf{1}_{I_0}$, and $I_0$ is any interval. The corresponding estimate holds for the dual form $B^{\mathrm{below}}(f, g)$.

This is a powerful Theorem, strongly suggesting that the A2 condition and testing the Hilbert transform
over bounded functions is sufficient for the $L^2$ boundedness of $H_\sigma$. But, there is no obvious way to deduce
such a result from the Theorem above. Phrasing things differently, it can be very difficult to translate
partial information about the triangular form $B^{\mathrm{above}}(f, g)$ to information about $\langle H_\sigma f, g \rangle_w$, a potentially
serious obstacle if a richer theory of two weight inequalities for singular integrals is to be developed.

The parallel corona was introduced in [24] to surmount this obstacle. With it, one could prove
the first real variable characterization of the two weight inequality for any continuous singular
integral.

Theorem E. [Lacey Sawyer Shen Uriarte-Tuero [24]] There holds $\mathcal{N} \approx A_2^{1/2} + \mathcal{T}_\infty$, where the latter
constant is the best constant in the inequalities below, uniform over all intervals I, and Borel subsets
E ⊂ I.

$$\int_I |H_\sigma \mathbf{1}_E|^2\, dw \le \mathcal{T}_\infty^2 \sigma(I), \qquad \int_I |H_w \mathbf{1}_E|^2\, d\sigma \le \mathcal{T}_\infty^2 w(I).$$

(One tests the Hilbert transform on $\mathbf{1}_E$, but only the weight of the interval I appears on the right.)

The parallel corona delays the application of Lemma 4.3; this feature, combined with a special function
theory specific to Haar expansions for non-doubling measures, was the critical ingredient.

The parallel corona has been used to give short transparent proofs of two weight inequalities for
singular integrals. See the last page of Hytonen's survey [15] and the article of Tanaka [52].

4.1.5. It is natural to wonder if there are any $L^p$ analogs of the main Theorem. However, there are
obstructions. To speak a bit loosely, the analog of (4.11) would be

feT I : I3F feT 

FeJ- 

Here, we have assumed an $L^p(\sigma)$ testing condition. There is an obvious analog of the quasi-orthogonality
condition, namely

$$\sum_{F \in \mathcal{F}} \alpha_f(F)^p\, \sigma(F) \lesssim \|f\|^p_{L^p(\sigma)}.$$

But, there is no corresponding estimate for $\sum_{F \in \mathcal{F}} \|g_F\|^{p'}_{L^{p'}(w)}$ when 1 < p' < 2. There does not seem
to be any way to proceed without an additional condition being placed on the pair of weights.

4.1.6. Return to the inequality (4.12), and note that it could be replaced by 

I:I2F JeJ*(F) '^' ^ 

Namely, on the right, we have 'made a hole' in the support of Φ. One then needs to control the sum
over F of the expression on the right, which in the spirit of §7, could be interpreted as a two weight
inequality for a Poisson integral 'with holes.' This is a difficult question, which we have avoided by 'filling
in the holes', which depends critically on the σ-Carleson property of the intervals in $\mathcal{F}$.

5. The Stopping Form
Given an interval $I_0$, the stopping form is

$$B^{\mathrm{stop}}_{I_0}(f, g) := \sum_{I : I \subset I_0}\ \sum_{J : J \Subset I} \mathbb{E}^\sigma_{I_J} \Delta^\sigma_I f \cdot \langle H_\sigma(I_0 - I_J), \Delta^w_J g \rangle_w. \tag{5.1}$$

We prove the estimate below for the stopping form, which completes the proof of the inequality (4.16), and so in view
of Lemma 4.15, completes the proof of the main theorem of this paper. Note that the hypotheses on
f and g are that they are adapted to energy stopping intervals. (Bounded averages on f are no longer
required.)

Lemma 5.2. Fix an interval $I_0$, and let f and g be adapted to $\mathcal{F}_{\mathrm{energy}}(I_0)$. Then,

$$|B^{\mathrm{stop}}_{I_0}(f, g)| \lesssim \mathcal{H}\, \|f\|_\sigma \|g\|_w.$$

The stopping form arises naturally in any proof of a T1 theorem using Haar or other bases. In the
non-homogeneous case, or in the Tb setting, where (adapted) Haar functions are important tools, it
frequently appears in more or less this form. Regardless of how it arises, the stopping form is treated as
an error, in that it is bounded by some simple geometric series, obtaining decay as e.g. the ratio |J|/|I| is
held fixed. (See for instance [33, (7.16)].)

These sorts of arguments, however, implicitly require some additional hypotheses, such as the weights
being mutually $A_\infty$. Of course, the two weights above can be mutually singular. There is no a priori
control of the stopping form in terms of simple parameters like |J|/|I|, even supplemented by additional
pigeonholing of various parameters.

Our method is inspired by proofs of Carleson's Theorem on Fourier series [5,10,25], and has one
particular precedent in the current setting, a much simpler bound for the stopping form in [24].

5.1. Admissible Pairs. A range of decompositions of the stopping form necessitate a somewhat heavy
notation that we introduce here. The individual summands in the stopping form involve four distinct
intervals, namely $I_0$, I, $I_J$, and J. The interval $I_0$ will not change in this argument, and the pair (I, J)
determine $I_J$. Subsequent decompositions are easiest to phrase as actions on collections $\mathcal{Q}$ of pairs of
intervals $Q = (Q_1, Q_2)$ with $Q_1 \Supset Q_2$. (The letter P is already taken for the Poisson integral.) And we
consider the bilinear forms

$$B_{\mathcal{Q}}(f, g) := \sum_{Q \in \mathcal{Q}} \mathbb{E}^\sigma_{(Q_1)_{Q_2}} \Delta^\sigma_{Q_1} f \cdot \langle H_\sigma(I_0 - (Q_1)_{Q_2}), \Delta^w_{Q_2} g \rangle_w.$$

We will have the standing assumption that all collections $\mathcal{Q}$ that we consider are admissible.

Definition 5.3. A collection of pairs $\mathcal{Q}$ is admissible if it meets these criteria. For any $Q = (Q_1, Q_2) \in \mathcal{Q}$,

(1) $Q_2 \Subset Q_1 \subset I_0$.

(2) (convexity in $Q_1$) If $Q'' \in \mathcal{Q}$ with $Q''_2 = Q_2$ and $Q''_1 \subset I \subset Q_1$, then there is a $Q' \in \mathcal{Q}$ with
$Q'_1 = I$ and $Q'_2 = Q_2$.

The first property is self-explanatory. The second property is convexity in $Q_1$, holding $Q_2$ fixed, which is
used in the estimates on the stopping form which conclude the argument. A third property is described
below.

We exclusively use the notation Q\^, k = 1,2 for the collection of intervals IJlQk • Q ^ Q}< i^ot 
counting multiplicity. Similarly, set Qi :={(Qi]q2 : Q G Q}, and Q^ :- {Qi)q2- 

(3) No interval K G Qi U Q2 is contained in an interval S G /"energy ( lo ) ■ 

The last requirement comes from the assumption that the functions f and g be adapted to J^energy(Io)- 
We will be appealing to different Hilbertian arguments below, so we prefer to make this an assumption 
about the pairs rather than the functions f, g. 

The stopping form is obtained with the admissible collection of pairs given by 

(5.4) $\mathcal Q_0=\{(I,J) : J\Subset I,\ J\not\subset\bigcup\{S : S\in\mathcal S\}\}.$ 

In this definition $\mathcal S$ is the collection of subintervals of $I_0$ which $f$ is uniform with respect to. There holds 

$$B^{\mathrm{stop}}_{I_0}(f,g)=B_{\mathcal Q_0}(f,g)\quad\text{for $f$, $g$ adapted to $\mathcal F_{\mathrm{energy}}(I_0)$.}$$

There is a very important notion of the size of $\mathcal Q$: 

$$\operatorname{size}(\mathcal Q)^2:=\sup_{K\in\tilde{\mathcal Q}_1\cup\mathcal Q_2}\frac{P(\sigma(I_0-K),K)^2}{\sigma(K)\,|K|^2}\sum_{J\in\mathcal Q_2 : J\subset K}\langle x,h^w_J\rangle_w^2 .$$

For admissible $\mathcal Q$, there holds $\operatorname{size}(\mathcal Q)\lesssim\mathcal H$, as follows from property (3) in Definition 5.3, and Definition 4.5. 

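More definitions follow. Set the norm $\mathbf B_{\mathcal Q}$ of the bilinear form $B_{\mathcal Q}$ to be the best constant in the 
inequality 

$$|B_{\mathcal Q}(f,g)|\le\mathbf B_{\mathcal Q}\|f\|_\sigma\|g\|_w .$$

Thus, our goal is to show that $\mathbf B_{\mathcal Q}\lesssim\operatorname{size}(\mathcal Q)$ for admissible $\mathcal Q$, but we will only be able to do this directly 
in the case that the pairs $(Q_1,Q_2)$ are weakly decoupled. 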

Say that collections of pairs $\mathcal Q^j$, for $j\in\mathbb N$, are mutually orthogonal if on the one hand, the collections 
$(\mathcal Q^j)_2$, of second coordinates of the pairs, are pairwise disjoint, and on the other, the collections $(\tilde{\mathcal Q}^j)_1$ 
are pairwise disjoint. (The concept has to be different in the first and second coordinates of the pairs, 
due to the different roles of the intervals $Q_1$ and $Q_2$.) 

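The meaning of mutual orthogonality is best expressed through the norm of the associated bilinear 
forms. Under the assumption that $B_{\mathcal Q}=\sum_{j\in\mathbb N}B_{\mathcal Q^j}$, and that the $\{\mathcal Q^j : j\in\mathbb N\}$ are mutually orthogonal, 
the following essential inequality holds. 

(5.5) $\mathbf B_{\mathcal Q}\le\sqrt2\,\sup_{j\in\mathbb N}\mathbf B_{\mathcal Q^j}.$ 

Indeed, for $j\in\mathbb N$, let $\Pi^w_j$ be the projection onto the linear span of the Haar functions $\{h^w_J : J\in(\mathcal Q^j)_2\}$, 
and use a similar notation for $\Pi^\sigma_j$. We then have the two inequalities 

$$\sum_{j\in\mathbb N}\|\Pi^w_jg\|_w^2\le\|g\|_w^2,\qquad\sum_{j\in\mathbb N}\|\Pi^\sigma_jf\|_\sigma^2\le2\|f\|_\sigma^2 .$$

Note the factor of two on the second inequality. Therefore, we have 

$$\begin{aligned}
|B_{\mathcal Q}(f,g)|&\le\sum_{j\in\mathbb N}|B_{\mathcal Q^j}(f,g)| \\
&=\sum_{j\in\mathbb N}|B_{\mathcal Q^j}(\Pi^\sigma_jf,\Pi^w_jg)| \\
&\le\sum_{j\in\mathbb N}\mathbf B_{\mathcal Q^j}\|\Pi^\sigma_jf\|_\sigma\|\Pi^w_jg\|_w\le\sqrt2\,\sup_{j\in\mathbb N}\mathbf B_{\mathcal Q^j}\cdot\|f\|_\sigma\|g\|_w .
\end{aligned}$$

This proves (5.5). 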
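The final inequality in the proof of (5.5) is Cauchy-Schwarz in the index $j$, followed by the two projection bounds. A short sketch of that step, where the shorthand $a_j=\|\Pi^\sigma_jf\|_\sigma$ and $b_j=\|\Pi^w_jg\|_w$ is introduced only for this illustration:

```latex
\begin{align*}
\sum_{j\in\mathbb N}\mathbf B_{\mathcal Q^j}\,a_j b_j
  &\le \sup_{j}\mathbf B_{\mathcal Q^j}
     \Bigl(\sum_{j} a_j^2\Bigr)^{1/2}\Bigl(\sum_{j} b_j^2\Bigr)^{1/2}
     && \text{(Cauchy--Schwarz in } j\text{)} \\
  &\le \sup_{j}\mathbf B_{\mathcal Q^j}\,
     \bigl(2\|f\|_\sigma^2\bigr)^{1/2}\bigl(\|g\|_w^2\bigr)^{1/2}
     && \text{(the two projection bounds)} \\
  &= \sqrt2\,\sup_{j}\mathbf B_{\mathcal Q^j}\,\|f\|_\sigma\|g\|_w .
\end{align*}
```

The constant $\sqrt2$ comes entirely from the factor of two in the bound for the projections in the first coordinate.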

5.2. The Recursive Argument. This is the essence of the matter. 

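Lemma 5.6. [Size Lemma] An admissible collection of pairs $\mathcal Q$ can be partitioned into collections $\mathcal Q^{\mathrm{large}}$ 
and admissible $\mathcal Q^{\mathrm{small}}_t$, for $t\in\mathbb N$, such that 

(5.7) $\mathbf B_{\mathcal Q}\le C\operatorname{size}(\mathcal Q)+(1+\sqrt2)\sup_{t}\mathbf B_{\mathcal Q^{\mathrm{small}}_t}$, 

and $\sup_{t\in\mathbb N}\operatorname{size}(\mathcal Q^{\mathrm{small}}_t)\le\tfrac14\operatorname{size}(\mathcal Q)$. 

Here, $C>0$ is an absolute constant. 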

The point of the lemma is that all of the constituent parts are better in some way, and that the right 
hand side of (5.7) involves a favorable supremum. We can quickly prove the main result of this section. 

Proof of Lemma 5.2. The stopping form of this Lemma is of the form $B_{\mathcal Q}(f,g)$ for admissible choice of 
$\mathcal Q$, with $\operatorname{size}(\mathcal Q)\le C\mathcal H$, as we have noted in (5.4). Define 

$$C(\lambda):=\sup\{\mathbf B_{\mathcal Q} : \operatorname{size}(\mathcal Q)\le C\lambda\mathcal H\},\qquad0<\lambda\le1,$$

where $C>0$ is a sufficiently large, but absolute constant, and the supremum is over admissible choices 
of $\mathcal Q$. We are free to assume that $\mathcal Q_1$ and $\mathcal Q_2$ are further constrained to be in some fixed, but large, 
collection of intervals $\mathcal I$. Then, it is clear that $C(\lambda)$ is finite, for all $0<\lambda\le1$. Because of the way the 
constant $\mathcal H$ enters into the definition, it remains to show that $C(1)$ admits an absolute upper bound, 
independent of how $\mathcal I$ is chosen. 

It is the consequence of Lemma 5.6 that there holds 

$$C(\lambda)\le C\lambda+(1+\sqrt2)\,C(\lambda/4),\qquad0<\lambda\le1 .$$

Iterating this inequality beginning at $\lambda=1$ gives us 

$$C(1)\le C+(1+\sqrt2)\,C(1/4)\le\cdots\le C\sum_{t=0}^{\infty}\Bigl[\frac{1+\sqrt2}{4}\Bigr]^t\le4C .$$

So we have established an absolute upper bound on $C(1)$. □ 
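The iteration can be unrolled explicitly. The following is a sketch only: the abbreviation $\theta=1+\sqrt2$ is ours, and the vanishing of the remainder term is an assumption of the sketch (presumably this is where the finiteness of $C(\lambda)$ is used), since the text does not spell it out.

```latex
\begin{align*}
C(1) &\le C + \theta\,C(4^{-1})
      \le C + \theta\bigl(C\,4^{-1} + \theta\,C(4^{-2})\bigr) \\
     &\le \cdots \le C\sum_{t=0}^{n-1}\Bigl(\frac{\theta}{4}\Bigr)^{t}
        + \theta^{\,n}\,C(4^{-n}) , \qquad \theta = 1+\sqrt2 . \\
\intertext{If the remainder $\theta^{\,n}C(4^{-n})$ tends to zero (assumed here), then}
C(1) &\le C\sum_{t=0}^{\infty}\Bigl(\frac{1+\sqrt2}{4}\Bigr)^{t}
      = \frac{4C}{3-\sqrt2} \approx 2.52\,C \le 4C .
\end{align*}
```

The series converges because $(1+\sqrt2)/4\approx0.604<1$; this is why the gain of a factor $\tfrac14$ in size is enough to absorb the loss of $1+\sqrt2$ in the norm.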

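5.3. Proof of Lemma 5.6. We restate the conclusion of Lemma 5.6 to more closely follow the line of 
argument to follow. The collection $\mathcal Q$ can be partitioned into two collections $\mathcal Q^{\mathrm{large}}$ and $\mathcal Q^{\mathrm{small}}$ such that 

(1) $\mathbf B_{\mathcal Q^{\mathrm{large}}}\lesssim\tau$, where $\tau=\operatorname{size}(\mathcal Q)$. 

(2) $\mathcal Q^{\mathrm{small}}=\mathcal Q^{\mathrm{small}}_1\cup\mathcal Q^{\mathrm{small}}_2$. 

(3) The collection $\mathcal Q^{\mathrm{small}}_1$ is admissible, and $\operatorname{size}(\mathcal Q^{\mathrm{small}}_1)\le\frac\tau4$. 

(4) For a collection of dyadic intervals $\mathcal L$, the collection $\mathcal Q^{\mathrm{small}}_2$ is the union of mutually orthogonal 
admissible collections $\mathcal Q^{\mathrm{small}}_{2,L}$, for $L\in\mathcal L$, with 

$$\operatorname{size}(\mathcal Q^{\mathrm{small}}_{2,L})\le\frac\tau4,\qquad L\in\mathcal L .$$

Thus, we have by inequality (5.5) for mutually orthogonal collections, 

$$\begin{aligned}
\mathbf B_{\mathcal Q}&\le\mathbf B_{\mathcal Q^{\mathrm{large}}}+\mathbf B_{\mathcal Q^{\mathrm{small}}} \\
&\le\mathbf B_{\mathcal Q^{\mathrm{large}}}+\mathbf B_{\mathcal Q^{\mathrm{small}}_1}+\mathbf B_{\mathcal Q^{\mathrm{small}}_2} \\
&\le C\tau+(1+\sqrt2)\max\Bigl\{\mathbf B_{\mathcal Q^{\mathrm{small}}_1},\ \sup_{L\in\mathcal L}\mathbf B_{\mathcal Q^{\mathrm{small}}_{2,L}}\Bigr\}.
\end{aligned}$$

This, with the properties of size listed above, proves Lemma 5.6 as stated, after a trivial re-indexing. 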
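The constant $1+\sqrt2$ arises as follows. In this sketch $X$ stands for the norm of the single admissible small collection and $Y$ for the supremum of the norms over the mutually orthogonal family indexed by $L$; both letters are introduced only for this illustration.

```latex
\begin{align*}
\mathbf B_{\mathcal Q^{\mathrm{small}}}
  &\le X + \sqrt2\,Y
     && \text{(triangle inequality, then (5.5) for the orthogonal family)} \\
  &\le (1+\sqrt2)\max\{X,\,Y\} .
\end{align*}
```

Re-indexing the single collection together with the family indexed by $L$ as one sequence, indexed by $t\in\mathbb N$, turns $\max\{X,Y\}$ into the supremum appearing in (5.7).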

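All else flows from this construction of a subset $\mathcal L$ of dyadic subintervals of $I_0$. The initial intervals in 
$\mathcal L$ are the minimal intervals $K\in\tilde{\mathcal Q}_1\cup\mathcal Q_2$ such that 

(5.8) $\dfrac{P(\sigma(I_0-K),K)^2}{\sigma(K)\,|K|^2}\displaystyle\sum_{J\in\mathcal Q_2 : J\subset K}\langle x,h^w_J\rangle_w^2\ge\frac{\tau^2}{16}.$ 

Since $\operatorname{size}(\mathcal Q)=\tau$, there are such intervals $K$. 

Figure 3. The shaded smaller tents have been selected, and $T_L$ is the minimal tent 
with $\mu(T_L)$ larger than $\rho$ times the $\mu$-measure of the shaded tents. 

Initialize $\mathcal S$ (for 'stock' or 'supply') to be all the dyadic intervals in $\tilde{\mathcal Q}_1\cup\mathcal Q_2$ which are not contained 
in any element of $\mathcal L$. In the recursive step, let $\mathcal L'$ be the minimal elements $S\in\mathcal S$ such that 

(5.9) $\displaystyle\sum_{J\in\mathcal Q_2 : J\subset S}\langle x,h^w_J\rangle_w^2\ge\rho\sum_{\substack{L\in\mathcal L : L\subset S\\ L\text{ is maximal}}}\ \sum_{J\in\mathcal Q_2 : J\subset L}\langle x,h^w_J\rangle_w^2,\qquad\rho=\tfrac{17}{16}.$ 

(The inequality would be trivial if $\rho=1$.) If $\mathcal L'$ is empty the recursion stops. Otherwise, update 
$\mathcal L\leftarrow\mathcal L\cup\mathcal L'$, and $\mathcal S\leftarrow\{K\in\mathcal S : K\not\subset L\ \ \forall L\in\mathcal L\}$. 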

Once the recursion stops, report the collection $\mathcal L$. It has this crucial property: For $L\in\mathcal L$, and integers 
$t\ge1$, 

(5.10) $\displaystyle\sum_{L' : \pi^t_{\mathcal L}L'=L}\ \sum_{J\in\mathcal Q_2 : J\subset L'}\langle x,h^w_J\rangle_w^2\le\rho^{-t}\sum_{J\in\mathcal Q_2 : J\subset L}\langle x,h^w_J\rangle_w^2 .$ 

Indeed, in the case of $t=1$, this is the selection criterion for membership in $\mathcal L$, and a simple induction proves 
the statement for all $t>1$. 

Remark 5.11. The selection of $\mathcal L$ can be understood as a familiar argument concerning Carleson measures, 
although there is no such object in this argument. Consider the measure $\mu$ on $\mathbb R^2_+$ given as a sum of 
point masses given by 

$$\mu:=\sum_{J\in\mathcal Q_2 : J\subset I_0}\langle x,h^w_J\rangle_w^2\,\delta_{(x_J,|J|)},\qquad x_J\text{ is the center of }J .$$

The tent over $L$ is the triangular region $T_L:=\{(x,y) : |x-x_L|\le|L|-y\}$, so that 

$$\mu(T_L)=\sum_{J\in\mathcal Q_2 : J\subset L}\langle x,h^w_J\rangle_w^2 .$$

Then, the selection rule for membership in $\mathcal L$ can be understood as taking the minimal tent $T_L$ such that 
$\mu(T_L)$ is bigger than $\rho$ times the $\mu$-measure of the selected tents. See Figure 3. 
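The 'simple induction' behind (5.10) can be written out as follows. This is a sketch under the reading that the $t$-fold $\mathcal L$-children of $L$ are the $\mathcal L$-children of its $(t-1)$-fold $\mathcal L$-children; the symbol $\Sigma_t(L)$ is introduced only for this illustration.

```latex
\begin{align*}
\Sigma_t(L) &:= \sum_{L' :\, \pi_{\mathcal L}^{t}L' = L}\ \sum_{J\in\mathcal Q_2 :\, J\subset L'}
      \langle x, h^w_J\rangle_w^2 ,
   \qquad \Sigma_0(L) := \sum_{J\in\mathcal Q_2 :\, J\subset L}\langle x, h^w_J\rangle_w^2 . \\
\intertext{The selection rule (5.9) is the case $t=1$: $\Sigma_1(L)\le\rho^{-1}\Sigma_0(L)$. Assuming (5.10) for $t$,}
\Sigma_{t+1}(L) &= \sum_{L' :\, \pi_{\mathcal L}^{t}L' = L}\Sigma_1(L')
   \le \rho^{-1}\sum_{L' :\, \pi_{\mathcal L}^{t}L' = L}\Sigma_0(L')
   = \rho^{-1}\,\Sigma_t(L)
   \le \rho^{-t-1}\,\Sigma_0(L) .
\end{align*}
```

In the language of the Remark, each generation of selected tents carries at most a fraction $\rho^{-1}$ of the $\mu$-mass of the generation above it.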

The decomposition of $\mathcal Q$ is based upon the relation of the pairs to the collection $\mathcal L$, namely a pair 
$\tilde Q_1$, $Q_2$ can (a) both have the same parent in $\mathcal L$; (b) have distinct parents in $\mathcal L$; (c) $Q_2$ can have a parent 
in $\mathcal L$, but not $\tilde Q_1$; and (d) $Q_2$ does not have a parent in $\mathcal L$. 

A particularly vexing aspect of the stopping form is the linkage between the martingale difference 
on $g$, which is given by $J$, and the argument of the Hilbert transform, $I_0-I_J$. The 'large' collections 
constructed below will, in a certain way, decouple the $J$ and the $I_0-I_J$, enough so that the norm of the 
associated bilinear form can be estimated by the size of $\mathcal Q$. 

In the 'small' collections, there is however no decoupling, but critically, the size of the collections 
is smaller, and the estimate is given in terms of the supremum in (5.7). 

Pairs comparable to $\mathcal L$. Define 

$$\mathcal Q_{L,t}:=\{Q\in\mathcal Q : \pi_{\mathcal L}\tilde Q_1=\pi^t_{\mathcal L}Q_2=L\},\qquad L\in\mathcal L,\ t\in\mathbb N .$$

These are admissible collections, as the convexity property in $Q_1$, holding $Q_2$ constant, is clearly inherited 
from $\mathcal Q$. Now, observe that for each $t\in\mathbb N$, the collections $\{\mathcal Q_{L,t} : L\in\mathcal L\}$ are mutually orthogonal. 
The collections of intervals $(\mathcal Q_{L,t})_2$ are obviously disjoint in $L\in\mathcal L$, with $t\in\mathbb N$ held fixed. And, since 
membership in these collections is determined in the first coordinate by the interval $\tilde Q_1$, and the two 
children of $Q_1$ can have two different parents in $\mathcal L$, a given interval $I$ can appear in at most two collections 
$(\mathcal Q_{L,t})_1$, as $L\in\mathcal L$ varies, and $t\in\mathbb N$ is held fixed. 

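Define $\mathcal Q^{\mathrm{small}}_2$ to be the union over $L\in\mathcal L$ of the collections 

$$\mathcal Q^{\mathrm{small}}_{2,L}:=\{Q\in\mathcal Q_{L,1} : \tilde Q_1\ne L\}.$$

Note in particular that we have only allowed $t=1$ above, and $\tilde Q_1=L$ is not allowed. For these 
collections, we need only verify that 

(5.12) $\operatorname{size}(\mathcal Q^{\mathrm{small}}_{2,L})\le\sqrt{\rho-1}\cdot\tau=\dfrac\tau4,\qquad L\in\mathcal L .$ 

Proof of (5.12). An interval $K\in(\tilde{\mathcal Q}^{\mathrm{small}}_{2,L})_1\cup(\mathcal Q^{\mathrm{small}}_{2,L})_2$ is not in $\mathcal L$, by construction. Suppose that $K$ does not 
contain any interval in $\mathcal L$. By the selection of the initial intervals in $\mathcal L$, the minimal intervals in $\tilde{\mathcal Q}_1\cup\mathcal Q_2$ 
which satisfy (5.8), it follows that the interval $K$ must fail (5.8). And so we are done. 

Thus, $K$ contains some element of $\mathcal L$, whence the inequality (5.9) must fail. Namely, rearranging that 
inequality, 

$$\sum_{\substack{J\in\mathcal Q_2 : \pi_{\mathcal L}J=L\\ J\subset K}}\langle x,h^w_J\rangle_w^2\le(\rho-1)\sum_{\substack{L'\in\mathcal L : L'\subset K\\ L'\text{ is maximal}}}\ \sum_{J\in\mathcal Q_2 : J\subset L'}\langle x,h^w_J\rangle_w^2 .$$

Recall that $\rho-1=\frac1{16}$. We can estimate 

$$\sum_{\substack{J\in\mathcal Q_2 : \pi_{\mathcal L}J=L\\ J\subset K}}\langle x,h^w_J\rangle_w^2\le\frac1{16}\sum_{J\in\mathcal Q_2 : J\subset K}\langle x,h^w_J\rangle_w^2\le\frac{\tau^2}{16}\cdot\frac{|K|^2\,\sigma(K)}{P(\sigma(I_0-K),K)^2}.$$

The last inequality follows from the definition of size, and finishes the proof of (5.12). □ 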
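To see how the last display gives (5.12), one can divide through by the normalizing factor in the definition of size. A sketch, assuming the reading $\rho-1=\frac1{16}$ used above:

```latex
\begin{align*}
\frac{P(\sigma(I_0-K),K)^2}{\sigma(K)\,|K|^2}
  \sum_{\substack{J\in\mathcal Q_2 :\, \pi_{\mathcal L}J = L\\ J\subset K}}
  \langle x, h^w_J\rangle_w^2
  \;\le\; (\rho-1)\,\tau^2 \;=\; \frac{\tau^2}{16} .
\end{align*}
% Taking the supremum over the admissible intervals K and then a square root:
\begin{align*}
\operatorname{size}\bigl(\mathcal Q^{\mathrm{small}}_{2,L}\bigr)
  \;\le\; \sqrt{\rho-1}\,\tau \;=\; \frac{\tau}{4} .
\end{align*}
```

This is the gain of a factor $\tfrac14$ in size required by Lemma 5.6.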

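The collections below are the first contribution to $\mathcal Q^{\mathrm{large}}$. Take $\mathcal Q^{\mathrm{large}}_1:=\bigcup\{\mathcal Q^{\mathrm{large}}_{1,L} : L\in\mathcal L\}$, where 

$$\mathcal Q^{\mathrm{large}}_{1,L}:=\{Q\in\mathcal Q_{L,1} : \tilde Q_1=L\}.$$

Note that Lemma 5.17 applies to these collections: take the collection $\mathcal S$ of that Lemma to be $\{L\}$, and the 
quantity $\eta$ in (5.18) satisfies $\eta\le\tau=\operatorname{size}(\mathcal Q)$, by inspection. From the mutual orthogonality (5.5), we 
then have 

$$\mathbf B_{\mathcal Q^{\mathrm{large}}_1}\le\sqrt2\,\sup_{L\in\mathcal L}\mathbf B_{\mathcal Q^{\mathrm{large}}_{1,L}}\lesssim\tau .$$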

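The collections $\mathcal Q_{L,t}$, for $L\in\mathcal L$, and $t\ge2$ are the second contribution to $\mathcal Q^{\mathrm{large}}$, namely 

$$\mathcal Q^{\mathrm{large}}_2:=\bigcup_{L\in\mathcal L}\bigcup_{t\ge2}\mathcal Q_{L,t}.$$

For them, we need to estimate $\mathbf B_{\mathcal Q_{L,t}}$. 

(5.13) $\mathbf B_{\mathcal Q_{L,t}}\lesssim\rho^{-t/2}\tau .$ 

From this, we can conclude from (5.5) that 

$$\mathbf B_{\mathcal Q^{\mathrm{large}}_2}\le\sum_{t\ge2}\mathbf B_{\bigcup\{\mathcal Q_{L,t} : L\in\mathcal L\}}\le\sqrt2\sum_{t\ge2}\sup_{L\in\mathcal L}\mathbf B_{\mathcal Q_{L,t}}\lesssim\tau\sum_{t\ge2}\rho^{-t/2}\lesssim\tau .$$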
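The sum over $t\ge2$ is a convergent geometric series with an absolute bound. A sketch, assuming the value $\rho=\frac{17}{16}$ read from (5.9):

```latex
\begin{align*}
\sum_{t\ge2}\rho^{-t/2}
  = \frac{\rho^{-1}}{1-\rho^{-1/2}}
  = \frac{16/17}{\,1-4/\sqrt{17}\,}
  \approx 31.5 .
\end{align*}
```

The ratio $\rho^{-1/2}=4/\sqrt{17}\approx0.970$ is close to one, so the implied constant is sizable, but it depends on nothing except $\rho$.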

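Proof of (5.13). For $L\in\mathcal L$, let $\mathcal S_L$ be the $\mathcal L$-children of $L$. For each $Q\in\mathcal Q_{L,t}$, we must have $Q_2\subset 
\pi_{\mathcal S_L}Q_2\subset\tilde Q_1$. Then, divide the collection $\mathcal Q_{L,t}$ into three collections $\mathcal Q^\ell_{L,t}$, $\ell=1,2,3$, where 

$$\mathcal Q^1_{L,t}:=\{Q\in\mathcal Q_{L,t} : Q_2\Subset\pi_{\mathcal S_L}Q_2\subset\tilde Q_1\},\qquad\mathcal Q^2_{L,t}:=\{Q\in\mathcal Q_{L,t} : Q_2\not\Subset\pi_{\mathcal S_L}Q_2\Subset\tilde Q_1\},$$

and $\mathcal Q^3_{L,t}:=\mathcal Q_{L,t}-(\mathcal Q^1_{L,t}\cup\mathcal Q^2_{L,t})$ is the complementary collection. Notice that $\mathcal Q^1_{L,t}$ equals the whole 
collection $\mathcal Q_{L,t}$ for $t>r+1$. 

We treat them in turn. The collections $\mathcal Q^1_{L,t}$ fit the hypotheses of Lemma 5.17, just take the collection 
of intervals $\mathcal S$ of that Lemma to be $\mathcal S_L$. It follows that $\mathbf B_{\mathcal Q^1_{L,t}}\lesssim\beta(t)$, where the latter is the best constant 
in the inequality 

(5.14) $\displaystyle\sum_{J\in(\mathcal Q^1_{L,t})_2 : J\subset K}P(\sigma(I_0-K),J)^2\Bigl\langle\frac x{|J|},h^w_J\Bigr\rangle_w^2\le\beta(t)^2\sigma(K),\qquad K\in\mathcal S_L,\ L\in\mathcal L,\ t\ge2 .$ 

There is an observation about the Poisson integral terms that we need. For $K$ as above, and $J\subset L'\Subset K$, 
note that by goodness of $L'$, 

$$\operatorname{dist}(J,I_0-K)\ge\operatorname{dist}(L',I_0-K)\ge|L'|^{\varepsilon}|K|^{1-\varepsilon}\ge2^{r(1-\varepsilon)}|L'| .$$

From the definition of the Poisson integral, one sees that 

(5.15) $\dfrac{P(\sigma(I_0-K),J)}{|J|}\simeq\dfrac{P(\sigma(I_0-K),L')}{|L'|}.$ 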

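We have the estimate without decay in $t$, $\beta(t)\lesssim\operatorname{size}(\mathcal Q)$. Indeed, for $K$ as in (5.14), let $\mathcal J^*$ be the 
maximal intervals $J^*\in(\mathcal Q_{L,t})_2$ with $J^*\Subset K$. Now, $\mathcal J^*$ is contained in the collection of intervals over 
which we test the size of $\mathcal Q$, hence by (5.15), 

$$\begin{aligned}
\mathrm{LHS}(5.14)&=\sum_{J^*\in\mathcal J^*}\ \sum_{J\in(\mathcal Q_{L,t})_2 : J\subset J^*}P(\sigma(I_0-K),J)^2\Bigl\langle\frac x{|J|},h^w_J\Bigr\rangle_w^2 \\
&\simeq\sum_{J^*\in\mathcal J^*}\frac{P(\sigma(I_0-K),J^*)^2}{|J^*|^2}\sum_{J\in(\mathcal Q_{L,t})_2 : J\subset J^*}\langle x,h^w_J\rangle_w^2 \\
&\lesssim\tau^2\sum_{J^*\in\mathcal J^*}\sigma(J^*)\le\tau^2\sigma(K).
\end{aligned}$$

This proves the claim, and we use the estimate for $t\le r+3$, say. (Recall that $r$ is a fixed integer.) 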

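In the case of $t>r+3$, the essential property is (5.10). The left hand side of (5.14) is dominated by 
the sum below. Note that we index the sum first over $L'$, which are $(r+1)$-fold $\mathcal L$-children of $K$, whence 
$L'\Subset K$, followed by $(t-r-2)$-fold $\mathcal L$-children of $L'$. 

$$\begin{aligned}
\sum_{\substack{L'\in\mathcal L\\ \pi^{r+1}_{\mathcal L}L'=K}}\ \sum_{\substack{L''\in\mathcal L\\ \pi^{t-r-2}_{\mathcal L}L''=L'}}&\ \sum_{J\in\mathcal Q_2 : J\subset L''}P(\sigma(I_0-K),J)^2\Bigl\langle\frac x{|J|},h^w_J\Bigr\rangle_w^2 \\
&\simeq\sum_{\substack{L'\in\mathcal L\\ \pi^{r+1}_{\mathcal L}L'=K}}\frac{P(\sigma(I_0-K),L')^2}{|L'|^2}\sum_{\substack{L''\in\mathcal L\\ \pi^{t-r-2}_{\mathcal L}L''=L'}}\ \sum_{J\in\mathcal Q_2 : J\subset L''}\langle x,h^w_J\rangle_w^2 \\
&\lesssim\rho^{-t}\sum_{\substack{L'\in\mathcal L\\ \pi^{r+1}_{\mathcal L}L'=K}}\frac{P(\sigma(I_0-K),L')^2}{|L'|^2}\sum_{J\in\mathcal Q_2 : J\subset L'}\langle x,h^w_J\rangle_w^2 \\
&\lesssim\rho^{-t}\tau^2\sum_{\substack{L'\in\mathcal L\\ \pi^{r+1}_{\mathcal L}L'=K}}\sigma(L')\le\tau^2\rho^{-t}\sigma(K).
\end{aligned}$$

We have also used (5.15), and then the central property (5.10) following from the construction of $\mathcal L$, 
finally appealing to the definition of size. Hence, $\beta(t)\lesssim\tau\rho^{-t/2}$. This completes the analysis of $\mathcal Q^1_{L,t}$. 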

We need only consider the collections $\mathcal Q^2_{L,t}$ for $1\le t\le r+1$, and they fall under the scope of 
Lemma 5.22. And, we see immediately that we have $\mathbf B_{\mathcal Q^2_{L,t}}\lesssim\tau$. Similarly, we need only consider the 
collections $\mathcal Q^3_{L,t}$ for $1\le t\le r+1$. It follows that we must have $2^r\le|Q_1|/|Q_2|\le2^{2r+2}$. Namely, this 
ratio can take only one of a finite number of values, implying that Lemma 5.24 applies easily to this 
case to complete the proof. □ 

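Pairs not strictly comparable to $\mathcal L$. It remains to consider the pairs $Q\in\mathcal Q$ such that $\tilde Q_1$ does not have 
a parent in $\mathcal L$. The collection $\mathcal Q^{\mathrm{small}}_1$ is taken to be the (much smaller) collection 

$$\mathcal Q^{\mathrm{small}}_1:=\{Q\in\mathcal Q : Q_2\text{ does not have a parent in }\mathcal L\}.$$

Observe that $\operatorname{size}(\mathcal Q^{\mathrm{small}}_1)\le\sqrt{\rho-1}\,\tau\le\frac\tau4$. This is as required for this collection. (The collections 
$\mathcal Q^{\mathrm{small}}_1$ and $\mathcal Q^{\mathrm{small}}_2$ are also mutually orthogonal, but this fact is not needed for our proof.) 

Proof. Suppose $\eta<\operatorname{size}(\mathcal Q^{\mathrm{small}}_1)$. Then, there is an interval $K\in(\tilde{\mathcal Q}^{\mathrm{small}}_1)_1\cup(\mathcal Q^{\mathrm{small}}_1)_2$ so that 

$$\eta^2\sigma(K)\le\frac{P(\sigma(I_0-K),K)^2}{|K|^2}\sum_{\substack{J\in(\mathcal Q^{\mathrm{small}}_1)_2\\ J\subset K}}\langle x,h^w_J\rangle_w^2 .$$

Suppose that $K$ does not contain any interval in $\mathcal L$. It follows from the initial intervals added to $\mathcal L$, see 
(5.8), that we must have $\eta\le\frac\tau4$. 

Thus, $K$ contains an interval in $\mathcal L$. This means that $K$ must fail the inequality (5.9). Therefore, we 
have 

$$\eta^2\sigma(K)\le(\rho-1)\frac{P(\sigma(I_0-K),K)^2}{|K|^2}\sum_{\substack{J\in\mathcal Q_2\\ J\subset K}}\langle x,h^w_J\rangle_w^2\le(\rho-1)\tau^2\sigma(K).$$

This relies upon the definition of size, and proves our claim. □ 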

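For the pairs not yet in one of our collections, it must be that $Q_2$ has a parent in $\mathcal L$, but not $\tilde Q_1$. 
Using $\mathcal L^*$, the maximal intervals in $\mathcal L$, divide them into the three collections 

$$\begin{aligned}
\mathcal Q^{\mathrm{large}}_3&:=\{Q\in\mathcal Q : Q_2\Subset\pi_{\mathcal L^*}Q_2\subset\tilde Q_1\},\\
\mathcal Q^{\mathrm{large}}_4&:=\{Q\in\mathcal Q : Q_2\not\Subset\pi_{\mathcal L^*}Q_2\Subset\tilde Q_1\},\\
\mathcal Q^{\mathrm{large}}_5&:=\{Q\in\mathcal Q : Q_2\not\Subset\pi_{\mathcal L^*}Q_2\subset\tilde Q_1,\text{ and }\pi_{\mathcal L^*}Q_2\not\Subset\tilde Q_1\}.
\end{aligned}$$

Observe that Lemma 5.17 applies to give 

(5.16) $\mathbf B_{\mathcal Q^{\mathrm{large}}_3}\lesssim\tau .$ 

Take the collection $\mathcal S$ of Lemma 5.17 to be $\mathcal L^*$, and note that the bound in that Lemma is given by $\eta$, 
as defined in (5.18), which by construction is less than $\tau=\operatorname{size}(\mathcal Q)$. 

Observe that Lemma 5.22 applies to show that the estimate (5.16) holds for $\mathcal Q^{\mathrm{large}}_4$. Take $\mathcal S$ of that 
Lemma to be $\mathcal L^*$. The estimate from Lemma 5.22 is given in terms of $\eta$, as defined in (5.23). But, $\eta$ is 
at most $\tau$. 

In the last collection, $\mathcal Q^{\mathrm{large}}_5$, notice that the conditions placed upon the pair imply that $|Q_1|\le 
2^{2r+2}|Q_2|$, for all $Q\in\mathcal Q^{\mathrm{large}}_5$. It therefore follows from a straightforward application of Lemma 5.24 
that (5.16) holds for this collection as well. 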

5.4. Upper Bounds on the Stopping Form. We have three lemmas that prove upper bounds on the 
norm of the stopping form in situations in which there is a measure of decoupling between the martingale 
difference on g, and the argument of the Hilbert transform. 

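Lemma 5.17. Let $\mathcal S$ be a collection of pairwise disjoint intervals in $I_0$. Let $\mathcal Q$ be admissible such that 
for each $Q\in\mathcal Q$, there is an $S\in\mathcal S$ with $Q_2\Subset S\subset\tilde Q_1$. Then, there holds 

$$|B_{\mathcal Q}(f,g)|\lesssim\eta\,\|f\|_\sigma\|g\|_w,$$

(5.18) where $\eta^2:=\displaystyle\sup_{S\in\mathcal S}\frac1{\sigma(S)}\sum_{J\in\mathcal Q_2 : J\Subset S}P(\sigma(I_0-S),J)^2\Bigl\langle\frac x{|J|},h^w_J\Bigr\rangle_w^2 .$ 

(Note that $\operatorname{size}(\mathcal Q)$ need not control $\eta$.) 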

Proof. An interesting part of the proof is that it depends very much on cancellative properties of the 
martingale differences of $f$. (Absolute values must be taken outside the sum defining the stopping form!) 
Assume that the Haar support of $f$ is contained in $\mathcal Q_1$. Take $\mathcal F$ and $\alpha_f(\cdot)$ to be stopping data defined 
in this way. First, add to $\mathcal F$ the interval $I_0$, and set $\alpha_f(I_0):=\mathbb E^\sigma_{I_0}|f|$. Inductively, if $F\in\mathcal F$ is minimal, 
add to $\mathcal F$ the maximal children $F'$ such that $\alpha_f(F'):=\mathbb E^\sigma_{F'}|f|>4\alpha_f(F)$. Note that the inequality (4.8) 
holds for this choice of $\mathcal F$ and $\alpha_f$, so that the quasi-orthogonality argument (4.9) is available to us. 

Write the bilinear form as 

$$B_{\mathcal Q}(f,g)=\sum_J\langle H_\sigma\varphi_J,\Delta^w_Jg\rangle_w,$$

(5.19) where $\varphi_J:=\displaystyle\sum_{Q\in\mathcal Q : Q_2=J}\mathbb E^\sigma_J\Delta^\sigma_{Q_1}f\cdot(I_0-\tilde Q_1).$ 

The function $\varphi_J$ is well-behaved. For any $J\in\mathcal Q_2$, $|\varphi_J|\lesssim\alpha_f(\pi_{\mathcal F}J)\,A_J$. In this definition, $A_J:=\bigcup\{I_0-\tilde Q_1 : 
Q\in\mathcal Q,\ Q_2=J\}$. Indeed, at each point $x\in A_J$, the sum defining $\varphi_J(x)$ is over pairs $Q$ such that 
$Q_2=J$ and $x\in I_0-\tilde Q_1$. By the convexity property of admissible collections, the sum is over consecutive 
martingale differences of $f$. The basic telescoping property of these differences shows that the sum is 
bounded by the stopping value $\alpha_f(\pi_{\mathcal F}J)$. Let $I^*$ be the maximal interval of the form $Q_1$ with $x\in I_0-\tilde Q_1$, 
and let $I_*$ be the child of the minimal such interval which contains $J$. Then, 

(5.20) $|\varphi_J(x)|=\Bigl|\displaystyle\sum_{\substack{Q\in\mathcal Q : Q_2=J\\ x\in I_0-\tilde Q_1}}\mathbb E^\sigma_J\Delta^\sigma_{Q_1}f\Bigr|=\bigl|\mathbb E^\sigma_{I_*}f-\mathbb E^\sigma_{I^*}f\bigr|\lesssim\alpha_f(\pi_{\mathcal F}J)\,(I_0-S)(x),$ 

where $S$ is the $\mathcal S$-parent of $J$. 

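We can estimate as below, for $F\in\mathcal F$: 

$$\begin{aligned}
\Xi(F)&:=\Bigl|\sum_{Q\in\mathcal Q : \pi_{\mathcal F}Q_2=F}\mathbb E^\sigma_{Q_2}\Delta^\sigma_{Q_1}f\cdot\langle H_\sigma(I_0-\tilde Q_1),\Delta^w_{Q_2}g\rangle_w\Bigr|=\Bigl|\sum_{J\in\mathcal Q_2 : \pi_{\mathcal F}J=F}\langle H_\sigma\varphi_J,\Delta^w_Jg\rangle_w\Bigr| \\
&\lesssim\alpha_f(F)\sum_{\substack{S\in\mathcal S\\ \pi_{\mathcal F}S=F}}\ \sum_{\substack{J\in\mathcal Q_2\\ J\subset S}}P(\sigma(I_0-S),J)\Bigl|\Bigl\langle\frac x{|J|},\Delta^w_Jg\Bigr\rangle_w\Bigr| \\
&\le\alpha_f(F)\Bigl[\sum_{\substack{S\in\mathcal S\\ \pi_{\mathcal F}S=F}}\ \sum_{\substack{J\in\mathcal Q_2\\ J\subset S}}P(\sigma(I_0-S),J)^2\Bigl\langle\frac x{|J|},h^w_J\Bigr\rangle_w^2\times\sum_{\substack{J\in\mathcal Q_2\\ \pi_{\mathcal F}J=F}}\hat g(J)^2\Bigr]^{1/2} \\
&\le\eta\,\alpha_f(F)\Bigl[\sum_{\substack{S\in\mathcal S\\ \pi_{\mathcal F}S=F}}\sigma(S)\times\sum_{\substack{J\in\mathcal Q_2\\ \pi_{\mathcal F}J=F}}\hat g(J)^2\Bigr]^{1/2} \\
&\le\eta\,\alpha_f(F)\,\sigma(F)^{1/2}\Bigl[\sum_{J\in\mathcal Q_2 : \pi_{\mathcal F}J=F}\hat g(J)^2\Bigr]^{1/2}.
\end{aligned}$$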

The top line follows from (5.19). In the second, we appeal to (5.20) and monotonicity (3.7), the latter 
being available to us since $J\subset S$ implies $J\Subset S$, by hypothesis. We also take advantage of the strong 
assumptions on the intervals in $\mathcal Q_2$. If $J\in\mathcal Q_2$, we must have $\pi_{\mathcal F}J=\pi_{\mathcal F}(\pi_{\mathcal S}J)$. The third line is 
Cauchy-Schwarz, followed by the appeal to the hypothesis (5.18), while the last line uses the fact that 
the intervals in $\mathcal S$ are pairwise disjoint. 

The quasi-orthogonality argument (4.9) completes the proof, namely we have 

(5.21) $\displaystyle\sum_{F\in\mathcal F}\Xi(F)\lesssim\eta\,\|f\|_\sigma\|g\|_w .$ □ 

Lemma 5.22. Let $\mathcal S$ be a collection of pairwise disjoint intervals in $I_0$. Let $\mathcal Q$ be admissible such that 
for each $Q\in\mathcal Q$, there is an $S\in\mathcal S$ with $Q_2\subset S\Subset\tilde Q_1$. Then, there holds 

$$|B_{\mathcal Q}(f,g)|\lesssim\eta\,\|f\|_\sigma\|g\|_w,$$

(5.23) where $\eta^2:=\displaystyle\sup_{S\in\mathcal S}\frac{P(\sigma(I_0-\pi_{\tilde{\mathcal Q}_1}S),S)^2}{\sigma(S)\,|S|^2}\sum_{J\in\mathcal Q_2 : J\subset S}\langle x,h^w_J\rangle_w^2 .$ 

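Proof. Construct stopping data $\mathcal F$ and $\alpha_f(\cdot)$ as in the proof of Lemma 5.17. The fundamental inequality 
(5.20) is again used. Then, by the monotonicity principle (3.7), there holds for $F\in\mathcal F$, 

$$\begin{aligned}
\Xi(F)&=\Bigl|\sum_{Q\in\mathcal Q : \pi_{\mathcal F}Q_2=F}\mathbb E^\sigma_{Q_2}\Delta^\sigma_{Q_1}f\cdot\langle H_\sigma(I_0-\tilde Q_1),\Delta^w_{Q_2}g\rangle_w\Bigr| \\
&\lesssim\alpha_f(F)\sum_{S\in\mathcal S : \pi_{\mathcal F}S=F}\frac{P(\sigma(I_0-\pi_{\tilde{\mathcal Q}_1}S),S)}{|S|}\sum_{J\in\mathcal Q_2 : J\subset S}|\langle x,\Delta^w_Jg\rangle_w| \\
&\le\alpha_f(F)\sum_{S\in\mathcal S : \pi_{\mathcal F}S=F}\frac{P(\sigma(I_0-\pi_{\tilde{\mathcal Q}_1}S),S)}{|S|}\Bigl[\sum_{J\in\mathcal Q_2 : J\subset S}\langle x,h^w_J\rangle_w^2\Bigr]^{1/2}\Bigl[\sum_{J\in\mathcal Q_2 : J\subset S}\hat g(J)^2\Bigr]^{1/2} \\
&\le\eta\,\alpha_f(F)\Bigl[\sum_{S\in\mathcal S : \pi_{\mathcal F}S=F}\sigma(S)\times\sum_{S\in\mathcal S : \pi_{\mathcal F}S=F}\ \sum_{J\in\mathcal Q_2 : J\subset S}\hat g(J)^2\Bigr]^{1/2} \\
&\le\eta\,\alpha_f(F)\,\sigma(F)^{1/2}\Bigl[\sum_{J\in\mathcal Q_2 : \pi_{\mathcal F}J=F}\hat g(J)^2\Bigr]^{1/2}.
\end{aligned}$$

After the monotonicity principle (3.7), we have used Cauchy-Schwarz, and the definition of $\eta$. The 
quasi-orthogonality argument (4.9) then completes the analysis of this term, see (5.21). □ 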

The last Lemma that we need is elementary, and is contained in the methods of [33]. 

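Lemma 5.24. Let $u\ge r+1$ be an integer, and $\mathcal Q$ be an admissible collection of pairs such that 
$|Q_1|=2^u|Q_2|$ for all $Q\in\mathcal Q$. There holds 

$$|B_{\mathcal Q}(f,g)|\lesssim\operatorname{size}(\mathcal Q)\,\|f\|_\sigma\|g\|_w .$$

Proof. Recall the form of the stopping form in (5.1). Observe, from inspection of the definition of the 
Haar function (2.1), that 

$$|\mathbb E^\sigma_{I_J}\Delta^\sigma_If|\le\frac{|\hat f(I)|}{\sigma(I_J)^{1/2}}.$$

Then, we have, keeping in mind that $I_J$ is one or the other of the two children of $I$, 

$$\begin{aligned}
|B_{\mathcal Q}(f,g)|&\le\sum_{I\in\mathcal Q_1}|\hat f(I)|\sum_{J : (I,J)\in\mathcal Q}\sigma(I_J)^{-1/2}P(\sigma(I_0-I_J),J)\Bigl\langle\frac x{|J|},h^w_J\Bigr\rangle_w|\hat g(J)| \\
&\le\|f\|_\sigma\Bigl[\sum_{I\in\mathcal Q_1}\Bigl[\sum_{J : (I,J)\in\mathcal Q}\sigma(I_J)^{-1/2}P(\sigma(I_0-I_J),J)\Bigl\langle\frac x{|J|},h^w_J\Bigr\rangle_w|\hat g(J)|\Bigr]^2\Bigr]^{1/2} \\
&\lesssim\operatorname{size}(\mathcal Q)\,\|f\|_\sigma\|g\|_w .
\end{aligned}$$

This follows immediately from Cauchy-Schwarz, and the fact that for each $J\in\mathcal Q_2$, there is a unique 
$I\in\mathcal Q_1$ such that the pair $(I,J)$ contributes to the sum above. □ 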

5.5. Context and Discussion. 

5.5.1. In the functional energy inequality, one 'stops' at a $\sigma$-Carleson family of intervals, whereas in 
the stopping form, every interval is 'stopping': the Haar function applied to $g$ and the argument of the 
Hilbert transform are coupled. This leads to many complications; for instance, the functional energy inequality 
has a nearly intrinsic formulation, while the stopping form does not. The proof herein succeeds because 
the notion of size approximates the operator norm of the stopping form. Moreover, in the 'large' portions 
of the stopping form, there is a decoupling that takes place. 

5.5.2. It is very interesting that one can prove unconditional results about the two weight Hilbert 
transform, following the techniques in [23], without solving the local problem. 

6. Elementary Estimates 

This section is devoted to the proof of Lemma 4.3. The estimates of this section are of a more classical 
nature, albeit the $A_2$ assumption is critical. (In fact, all the estimates in this section depend only on the 
$A_2$ hypothesis, but this is not tracked in the notation.) First some basic estimates are collected. This is 
a property of good intervals, which can be effectively used in non-critical situations. 



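Lemma 6.1. For three intervals $J,I,I'\in\mathcal D$ with $J\subset I\subset I'$, $|I|=2^s|J|$, with $s>r$ and $J$ good, then 

(6.2) $P(\sigma\cdot(I'-I),J)\lesssim2^{-(1-2\varepsilon)s}P(\sigma\cdot I',I).$ 

Proof. Note that for $x\in I'-I$ we have 

$$\operatorname{dist}(x,J)\ge|I|^{1-\varepsilon}|J|^{\varepsilon}=2^{s(1-\varepsilon)}|J| .$$

Using this in the definition of the Poisson integral, we get 

$$\begin{aligned}
P(\sigma\cdot(I'-I),J)&=\int_{I'-I}\frac{|J|}{(|J|+\operatorname{dist}(x,J))^2}\,\sigma(dx) \\
&\lesssim2^{-s(1-2\varepsilon)}\int_{I'-I}\frac{|I|}{(|I|+\operatorname{dist}(x,I))^2}\,\sigma(dx)=2^{-(1-2\varepsilon)s}P(\sigma(I'-I),I).
\end{aligned}$$
□ 

Proposition 6.3. Suppose that two intervals $I,J\in\mathcal D$ satisfy $|I|\ge|J|$ and $3I\cap J=\emptyset$, then 

(6.4) $|\langle H_\sigma I,h^w_J\rangle_w|\lesssim\sigma(I)\sqrt{w(J)}\,\dfrac{|J|}{(|J|+\operatorname{dist}(I,J))^2}.$ 

Proof. Since $h^w_J$ has $w$-integral zero, estimate as below, where $x_J$ is the center of $J$. 

$$\begin{aligned}
|\langle H_\sigma I,h^w_J\rangle_w|&=\Bigl|\int_I\int_J\frac1{y-x}\,h^w_J(x)\,w(dx)\,\sigma(dy)\Bigr| \\
&=\Bigl|\int_I\int_J\Bigl\{\frac1{y-x}-\frac1{y-x_J}\Bigr\}h^w_J(x)\,w(dx)\,\sigma(dy)\Bigr| \\
&=\Bigl|\int_I\int_J\frac{x-x_J}{(y-x)(y-x_J)}\,h^w_J(x)\,w(dx)\,\sigma(dy)\Bigr| .
\end{aligned}$$

The Proposition follows by inspection. (This is a very simple version of the monotonicity principle.) □ 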



Proposition 6.5. Suppose that two intervals $I,J\in\mathcal D$ satisfy $2^s|J|=|I|$, where $s>r$, the interval $J$ is 
good, and $J\subset3I\setminus I$, then 

(6.6) $|\langle H_\sigma I,h^w_J\rangle_w|\lesssim2^{-(1-2\varepsilon)s}\,\sigma(I)\sqrt{w(J)}\,|I|^{-1}.$ 

Proof. Under the assumption of the Proposition, the proof of Proposition 6.3 holds, supplying the 
estimate of that Proposition. But, the extra assumption that $J$ is good implies that $\operatorname{dist}(J,I)\ge2^{s(1-\varepsilon)}|J|$, 
and then the estimate follows by inspection. 

□ 
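The 'inspection' in the proof of Proposition 6.5 amounts to the following arithmetic. This is a sketch which assumes that the kernel factor in (6.4) is $|J|/(|J|+\operatorname{dist}(I,J))^2$ and that goodness gives $\operatorname{dist}(I,J)\ge2^{s(1-\varepsilon)}|J|$ when $|I|=2^s|J|$:

```latex
\begin{align*}
\frac{|J|}{(|J|+\operatorname{dist}(I,J))^{2}}
  \;\le\; \frac{|J|}{\operatorname{dist}(I,J)^{2}}
  \;\le\; \frac{2^{-2s(1-\varepsilon)}}{|J|}
  \;=\; \frac{2^{-2s(1-\varepsilon)}\,2^{s}}{|I|}
  \;=\; \frac{2^{-(1-2\varepsilon)s}}{|I|} .
\end{align*}
```

Multiplying by $\sigma(I)\sqrt{w(J)}$ gives the right-hand side of (6.6).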

6.1. The Weak Boundedness Inequality. The following inequality is a weak-boundedness inequality, 
a consequence of the $A_2$ inequality. 

Proposition 6.7. There holds for all integers $r$, and for all disjoint intervals $I$, $J$, 

(6.8) $|\langle H_\sigma I,J\rangle_w|\lesssim\mathcal A_2^{1/2}\,[\sigma(I)w(J)]^{1/2},$ 

provided (a) $2^{-r}\le|I|/|J|\le2^r$; and (b) the two weights do not have a point mass at an endpoint. The 
implied constant depends upon $r$. 

For the proof, we will have recourse to Muckenhoupt's characterization of the two weight inequality 
for the Hardy operator [27]. 

Theorem F. For weights $\hat w$ and $\sigma$ supported on $\mathbb R_+$, 

(6.9) $\Bigl\|\displaystyle\int_0^xf\,\sigma(dy)\Bigr\|_{L^2(\hat w)}\le\mathscr B\,\|f\|_\sigma,$ 

(6.10) where $\mathscr B^2\simeq\displaystyle\sup_{0<r<\infty}\int_r^\infty\hat w(dx)\times\int_0^r\sigma(dy).$ 

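Proof of Proposition 6.7. Without loss of generality, suppose that $J=(-a,0)$, for $a>0$, and $I=(\alpha,\beta)$, 
where $0\le\alpha<\beta$. Write 

$$\langle H_\sigma I,J\rangle_w=\iint_{(-a,0)\times(\alpha,\beta)}\mathbf 1_{\{-x>y\}}\frac{w(dx)\,\sigma(dy)}{y-x}+\iint_{(-a,0)\times(\alpha,\beta)}\mathbf 1_{\{-x<y\}}\frac{w(dx)\,\sigma(dy)}{y-x}=:\mathrm I+\mathrm{II}.$$

If $\alpha>a$, then $\mathrm I$ is zero, and we have 

$$|\mathrm{II}|\le\frac{\sigma(I)\,w(J)}{\alpha}\lesssim\mathcal A_2^{1/2}\,[\sigma(I)w(J)]^{1/2}.$$

Notice that the implied constant depends upon $I$ and $J$ having comparable lengths. 

Assuming that $0\le\alpha<a$, the two terms are similar, and we only consider $\mathrm I$. The point of the diagonal 
restriction is that it brings the Hardy inequality into play. Set $z=-x$ and $\hat w(dz)=z^{-2}w(-dz)$. 

$$\begin{aligned}
|\mathrm I|&=\iint_{\{z>y\}}\frac1{y+z}\,\sigma(dy)\,w(-dz)\le\int_0^a\frac1z\int_0^z\mathbf 1_I(y)\,\sigma(dy)\,w(-dz) \\
&\le\Bigl[\int_0^a\Bigl(\int_0^z\mathbf 1_I(y)\,\sigma(dy)\Bigr)^2\hat w(dz)\Bigr]^{1/2}w(J)^{1/2}\le\mathscr B\,[\sigma(I)w(J)]^{1/2}.
\end{aligned}$$

Here, we have applied Cauchy-Schwarz and the Hardy inequality (6.9), and the constant $\mathscr B$ is as in 
(6.10). 

It remains to estimate the constant $\mathscr B$, as given in (6.10), by the $A_2$ constant. 

$$\int_r^\infty\hat w(dz)\times\int_0^r\sigma(dy)=\int_r^\infty\frac{w(-dz)}{z^2}\times\int_0^r\sigma(dy)\lesssim P(w,(-r,0))\times\frac{\sigma(0,r)}{r}\lesssim\mathcal A_2 .$$
□ 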



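6.2. The Different Subcases of Lemma 4.3. Lemma 4.3 follows from appropriate bounds on these 
bilinear forms, and their duals. 

$$\begin{aligned}
B^{\mathrm{nearby}}(f,g)&:=\sum_{\substack{I,J : 2^{-r}|I|\le|J|\le|I|\\ 3I\cap J\ne\emptyset}}|\langle H_\sigma\Delta^\sigma_If,\Delta^w_Jg\rangle_w|,\\
B^{\mathrm{far}}(f,g)&:=\sum_{\substack{I,J : 2^r|J|\le|I|\\ 3I\cap J=\emptyset}}|\langle H_\sigma\Delta^\sigma_If,\Delta^w_Jg\rangle_w|,\\
B^{\mathrm{close}}(f,g)&:=\sum_{\substack{I,J : 2^r|J|\le|I|\\ J\subset3I\setminus I}}|\langle H_\sigma\Delta^\sigma_If,\Delta^w_Jg\rangle_w|,\\
B^{\mathrm{adjacent}}(f,g)&:=\sum_{I,J : J\Subset I}|\mathbb E^\sigma_{I_J}\Delta^\sigma_If\cdot\langle H_\sigma(I-I_J),\Delta^w_Jg\rangle_w| .
\end{aligned}$$

Lemma 6.11. For $*\in\{\text{nearby},\text{far},\text{close},\text{adjacent}\}$, there holds 

$$B^{*}(f,g)\lesssim\mathcal A_2^{1/2}\,\|f\|_\sigma\|g\|_w .$$

6.3. The Nearby Term. One can check directly that for each interval $I$, with child $I'$, there holds 
$|\mathbb E^\sigma_{I'}h^\sigma_I|\le\sigma(I')^{-1/2}$. It then follows from (6.8) that $|\langle H_\sigma h^\sigma_I,h^w_J\rangle_w|\lesssim\mathcal A_2^{1/2}$. And then, 

$$B^{\mathrm{nearby}}(f,g)\lesssim\mathcal A_2^{1/2}\sum_{I,J : 2^{-r}|I|\le|J|\le|I|}|\hat f(I)\,\hat g(J)|\lesssim\mathcal A_2^{1/2}\,\|f\|_\sigma\|g\|_w .$$

The last line follows from the fact that for each $I$, there are only a bounded number of $J$ occurring in 
the sum. 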

6.4. The Far Term. For integers s > r , the sum below specifies a relative length for the interval J with 
respect to I. 

IJ:2^IJh|I| 

jn3i=0 



^ Y_ i^(i)^( 



IJ:21JI=|I| 

[n3i=0 



v/o-(I)w(J)|J| 
|J|+dist(I,J)F 






Lifmi[ L 



J:21J|HI| 

jn3i=0 



CT(I) Y. 



w 



<yif ^|f(i)|[ ^ 



j^,^^,|(lJl+dist(I,J))^ 
jn3i=0 



J:21J|=|I| 

jn3i=0 



■ (|J|+dist(I,J))^ 



1/2 



I J:2=|J|=|I| 

jn3i=0 



+ dist(I,J)]2 



<yiyV^||f| 



u[ L 



0(j) 



2|Tl2 



IJ:2^IJI=|I| 

jn3i=0 



+ dist(I,J])2 



1/2 



,1/2 



^-^2 llflUllgl 



The second line follows from (6.4), and then using Cauchy-Schwarz, so that one can appeal to the A2 
condition. The last line follows by inspection. This estimate is summable in s > r, so this case is 
complete. 



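6.5. The Close Term. For integers $s\ge r$, the sum below specifies a relative length of $J$ with respect to $I$. 

$$\begin{aligned}
\sum_{\substack{I,J : 2^s|J|=|I|\\ J\subset3I\setminus I}}|\langle H_\sigma\Delta^\sigma_If,\Delta^w_Jg\rangle_w|&\lesssim2^{-(1-2\varepsilon)s}\sum_{\substack{I,J : 2^s|J|=|I|\\ J\subset3I\setminus I}}|\hat f(I)\,\hat g(J)|\frac{\sqrt{\sigma(I)w(J)}}{|I|} \\
&=2^{-(1-2\varepsilon)s}\sum_I|\hat f(I)|\frac{\sqrt{\sigma(I)}}{|I|}\sum_{\substack{J : 2^s|J|=|I|\\ J\subset3I\setminus I}}|\hat g(J)|\sqrt{w(J)} \\
&\le2^{-(1-2\varepsilon)s}\sum_I|\hat f(I)|\frac{\sqrt{\sigma(I)}}{|I|}\Bigl[\sum_{\substack{J : 2^s|J|=|I|\\ J\subset3I\setminus I}}\hat g(J)^2\times\sum_{\substack{J : 2^s|J|=|I|\\ J\subset3I\setminus I}}w(J)\Bigr]^{1/2} \\
&\lesssim2^{-(1-2\varepsilon)s}\mathcal A_2^{1/2}\sum_I|\hat f(I)|\Bigl[\sum_{\substack{J : 2^s|J|=|I|\\ J\subset3I\setminus I}}\hat g(J)^2\Bigr]^{1/2} \\
&\lesssim2^{-(1-2\varepsilon)s}\mathcal A_2^{1/2}\,\|f\|_\sigma\|g\|_w .
\end{aligned}$$

This follows from applying (6.6), followed by Cauchy-Schwarz, and the simple $A_2$ bound. The last line 
follows since for each $J$, there is a unique choice of $I$ for which the pair $(I,J)$ appears in the sum. Since 
the last bound is summable in $s$, this completes the estimate. 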
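The summability in $s$ is a geometric series. A sketch, under the assumption (standard for the goodness parameter, but not restated in this section) that $0<\varepsilon<\frac12$:

```latex
\begin{align*}
\sum_{s\ge r} 2^{-(1-2\varepsilon)s}
  = \frac{2^{-(1-2\varepsilon)r}}{1-2^{-(1-2\varepsilon)}}
  < \infty ,
  \qquad 0<\varepsilon<\tfrac12 .
\end{align*}
```

The implied constant degenerates as $\varepsilon\to\frac12$, which is why the exponent $1-2\varepsilon$ must stay positive.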

6.6. The Adjacent Term. We argue as in the previous case. It is easy to see that |E5^_j AJ^f| < 
|f(I)|a(I-Ijr^/2. 

For Q ^ d' ^ {lb}, and consider the sum below, where s plays the same role as before. 



Y_ KAff-{H,le,AYgU<2-^'-^''>' Y. l^^^MDl 



Vo-(Ie)w(J) 



IJ:2^IJh|I| IJ:2=|J|=|I| ' ' 

Jci+(e'|i|) jci+(e'|i|) 



< ;-(l-2e)s/,l/2||f|| I 
~ ^ -^1 Holier I 



The details are suppressed. 



6.7. Context and Discussion. The techniques of this section are all drawn from the work of Nazarov- 
Treil-Volberg [33,54], aside from the use of the two weight Hardy inequality, which is drawn from [21]. 

7. Functional Energy Inequality 

We state an important multi-scale extension of the energy inequality (3.9). 

Definition 7.1. Let $\mathcal F$ be a collection of dyadic intervals. A collection of functions $\{g_F\}_{F\in\mathcal F}$ in $L^2(w)$ is 
said to be $\mathcal F_{\Subset}$-adapted if 

(1) The functions $g_F\in L^2_0(F,w)$. 

(2) Letting $\mathcal J(F)=\{J : \hat g_F(J)\ne0\}$, these collections are contained in $\{J : \pi_{\mathcal F}J=F,\ J\Subset F\}$. 

The main result of this section is this extension of the energy inequality (3.9). 

Theorem 7.2. The inequality below holds for all non-negative $h\in L^2(\sigma)$, all $\sigma$-Carleson collections $\mathcal F$, 
and all $\mathcal F_{\Subset}$-adapted collections $\{g_F\}_{F\in\mathcal F}$: 

(7.3) $\displaystyle\sum_{F\in\mathcal F}\ \sum_{J\in\mathcal J^*(F)}P(h\sigma,J)\Bigl|\Bigl\langle\frac x{|J|},g_F\,J\Bigr\rangle_w\Bigr|\lesssim\mathcal H\,\|h\|_\sigma\Bigl[\sum_{F\in\mathcal F}\|g_F\|_w^2\Bigr]^{1/2}.$ 

Here $\mathcal J^*(F)$ consists of the maximal intervals $J$ in the collection $\mathcal J(F)$. 

The inequality above should be thought of as a two weight inequality for the Poisson integral, a 
decisive step, since there is a two weight inequality for the Poisson operator proved by Sawyer [48], 
which is recalled in §8. It reduces the full norm inequality above to simpler testing conditions. The 
testing conditions are expressed in terms of the Poisson integral, many of which are controlled by the $A_2$ 
condition. There is one that, in view of the monotonicity principle, can be turned into interval testing 
conditions for the Hilbert transform itself. 



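7.1. Two Weight Poisson Inequality. Consider the weight $\mu$ on $\mathbb R^2_+$ given by 

$$\mu:=\sum_{F\in\mathcal F}\ \sum_{J\in\mathcal J^*(F)}\Bigl\|P^w_{F,J}\frac x{|J|}\Bigr\|_w^2\cdot\delta_{(x_J,|J|)}.$$

Here, $P^w_{F,J}:=\sum_{J'\in\mathcal J(F) : J'\subset J}\Delta^w_{J'}$. We can replace $x$ by $x-c$ for any choice of $c$ we wish; the projection 
is unchanged. And $\delta_q$ denotes a Dirac unit mass at a point $q$ in the upper half plane $\mathbb R^2_+$. 

We prove the two-weight inequality for the Poisson integral: 

$$\|P(h\sigma)\|_{L^2(\mathbb R^2_+,\mu)}\lesssim\mathcal H\,\|h\|_\sigma,$$

for all nonnegative $h$. Above $P(\cdot)$ denotes the Poisson extension to the upper half-plane, so that in 
particular 

$$\|P(h\sigma)\|^2_{L^2(\mathbb R^2_+,\mu)}=\sum_{F\in\mathcal F}\ \sum_{J\in\mathcal J^*(F)}P(h\sigma)(x_J,|J|)^2\,\Bigl\|P^w_{F,J}\frac x{|J|}\Bigr\|_w^2,$$

where $x_J$ is the center of the interval $J$. The proof of Theorem 7.2 follows by duality. 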

Phrasing things in this way brings a significant advantage: The characterization of the two-weight 
inequality for the Poisson operator, Theorem G, reduces the full norm inequality above to these testing 
inequalities. For any dyadic interval $I\in\mathcal D$, 

(7.4) $\displaystyle\int_{\widehat{3I}}P(\sigma\cdot I)^2\,d\mu(x,t)\lesssim\mathcal H^2\sigma(I),$ 

(7.5) $\displaystyle\int_{3I}P^*(t\,\mathbf 1_{\hat I}\mu)^2\,\sigma(dx)\lesssim\mathcal A_2\int_{\hat I}t^2\,\mu(dx,dt),$ 

where $\hat I=I\times[0,|I|]$ is the Carleson box over $I$ in the upper half-plane, and $\widehat{3I}=(3I)\times[0,|I|]$. The dual 
Poisson operator $P^*$ is 

$$P^*(t\,\mathbf 1_{\hat I}\mu)(x)=\int_{\hat I}\frac{t^2}{t^2+|x-y|^2}\,\mu(dy,dt).$$

One should keep in mind that the intervals $I$ are restricted to be in our fixed dyadic grid, a reduction 
allowed as the integrations on the left in (7.4) and (7.5) are done over sets that are not in the dyadic 
grid. (Goodness of the intervals $I$ above is not needed.) This reduction allows an appeal to the interval 
testing condition in the next subsection. 

7.2. The Poisson Testing Inequality: The Core. This subsection is concerned with this part of 
inequality (7.4): Restrict the integral on the left to the set $\hat I\subset\mathbb R^2_+$: 

$$\int_{\hat I}P(\sigma\cdot I)^2\,d\mu(x,t)\lesssim\mathcal H^2\sigma(I).$$

Since $(x_J,|J|)\in\hat I$ if and only if $J\subset I$, we have 

$$\int_{\hat I}P(\sigma\cdot I)^2\,d\mu(x,t)=\sum_{F\in\mathcal F}\ \sum_{J\in\mathcal J^*(F) : J\subset I}P(\sigma\cdot I)(x_J,|J|)^2\,\Bigl\|P^w_{F,J}\frac x{|J|}\Bigr\|_w^2 .$$

For each $J$, 

(7.6) $\Bigl\|P^w_{F,J}\dfrac x{|J|}\Bigr\|_w^2\le\displaystyle\int_J\Bigl|\frac{x-\mathbb E^w_Jx}{|J|}\Bigr|^2\,dw(x)=2E(w,J)^2w(J)\le2w(J).$ 



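A straightforward estimation is not possible, because the intervals $J$ overlap. The intervals $\mathcal F$ obey a 
$\sigma$-Carleson measure condition, which we exploit in the first stage of the proof. We 'create some holes' 
by restricting the support of $\sigma$ to the interval $F$ in the sum below. 

$$\sum_{F\in\mathcal F}\ \sum_{J\in\mathcal J^*(F) : J\subset I}P(\sigma\cdot(F\cap I),J)^2\,\Bigl\|P^w_{F,J}\frac x{|J|}\Bigr\|_w^2=\Bigl\{\sum_{F\in\mathcal F : F\subset I}+\sum_{F\in\mathcal F : F\supsetneq I}\Bigr\}\sum_{J\in\mathcal J^*(F) : J\subset I}\cdots=:A+B .$$

The first of these terms is at most 

$$A\lesssim\sum_{F\in\mathcal F : F\subset I}\ \sum_{J\in\mathcal J^*(F)}P(F\sigma,J)^2E(w,J)^2w(J)\lesssim\mathcal H^2\sum_{F\in\mathcal F : F\subset I}\sigma(F)\lesssim\mathcal H^2\sigma(I).$$

Here we have used (7.6), the energy inequality (3.9), and that the stopping intervals $\mathcal F$ satisfy a $\sigma$-Carleson 
measure estimate. 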

Concerning the second term, a supplemental inequality is needed: For all intervals $I$ the collection 

$$\mathcal K(I):=\{J\subset I : J\text{ is maximal in }\mathcal J^*(F)\text{ for some }F\supset I\}$$

are pairwise disjoint. Indeed, let $J_1\subset J_2$ be two intervals in $\mathcal K(I)$. By maximality, $J_1$ must be strictly 
contained in $J_2$. Now, letting $F_s=\pi_{\mathcal F}J_s$, there follows $I\subset F_1\subsetneq F_2$. But, this contradicts $\pi_{\mathcal F}J_2=F_2$. 
Using (7.6) and the energy inequality, term $B$ satisfies 

$$B\lesssim\sum_{J\in\mathcal K(I)}P(\sigma\cdot I,J)^2\,\Bigl\|P^w_{F_J,J}\frac x{|J|}\Bigr\|_w^2\lesssim\sum_{J\in\mathcal K(I)}P(\sigma\cdot I,J)^2E(w,J)^2w(J)\lesssim\mathcal H^2\sigma(I).$$

In the first line, $F_J$ is the unique $F\in\mathcal F$ with $J\in\mathcal J^*(F)$ and $F\supset I$. 
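It remains then to show the following inequality with 'holes': 

$$\sum_{F\in\mathcal F_I}\ \sum_{J\in\mathcal J^*(F)}P(\sigma(I-F),J)^2\,\Bigl\|P^w_{F,J}\frac x{|J|}\Bigr\|_w^2\lesssim\mathcal H^2\sigma(I),$$

where $\mathcal F_I$ consists of those $F\in\mathcal F$ with $F\subset I$. Our purpose is to pass back to the Hilbert transform, so 
that we can effectively use the testing condition. The inequality above can be expressed in dual language 
as the inequality 

$$\sum_{F\in\mathcal F_I}\ \sum_{J\in\mathcal J^*(F)}P(\sigma(I-F),J)\Bigl\langle P^w_{F,J}\frac x{|J|},g\Bigr\rangle_w\lesssim\mathcal H\,\sigma(I)^{1/2}\|g\|_w .$$

In the inner product, $g\in L^2(w)$ can be replaced by $g_{F,J}:=P^w_{F,J}g$ by self-adjointness of the projections. 
Also, $\langle x,h^w_J\rangle_w>0$, so that we are free to assume that $\langle g,h^w_J\rangle_w\ge0$ for all $J$. 
We can estimate, using the monotonicity property (3.7), 

$$P(\sigma(I-F),J)\Bigl\langle\frac x{|J|},g_{F,J}\Bigr\rangle_w\lesssim\langle H_\sigma(I-F),g_{F,J}\rangle_w,\qquad J\in\mathcal J^*(F).$$

It therefore suffices to show that 

(7.7) $\Bigl|\displaystyle\sum_{F\in\mathcal F_I}\ \sum_{J\in\mathcal J^*(F)}\langle H_\sigma(I-F),g_{F,J}\rangle_w\Bigr|\lesssim\mathcal H\,\sigma(I)^{1/2}\|g\|_w .$ 

Use linearity in the argument of the Hilbert transform, which gives two terms. The first is 

$$\Bigl|\sum_{F\in\mathcal F_I}\ \sum_{J\in\mathcal J^*(F)}\langle H_\sigma I,g_{F,J}\rangle_w\Bigr|=|\langle H_\sigma I,g\rangle_w|\lesssim\mathcal H\,\sigma(I)^{1/2}\|g\|_w .$$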

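The second term appeals to interval testing and the $\sigma$-Carleson measure condition (4.7). 

$$\begin{aligned}
\Bigl|\sum_{F\in\mathcal F_I}\ \sum_{J\in\mathcal J^*(F)}\langle H_\sigma F,g_{F,J}\rangle_w\Bigr|&\le\sum_{F\in\mathcal F_I}\Bigl|\Bigl\langle H_\sigma F,\sum_{J\in\mathcal J^*(F)}g_{F,J}\Bigr\rangle_w\Bigr| \\
&\le\mathcal H\sum_{F\in\mathcal F_I}\sigma(F)^{1/2}\Bigl\|\sum_{J\in\mathcal J^*(F)}g_{F,J}\Bigr\|_w \\
&\le\mathcal H\Bigl[\sum_{F\in\mathcal F_I}\sigma(F)\times\sum_{F\in\mathcal F_I}\ \sum_{J\in\mathcal J^*(F)}\|g_{F,J}\|_w^2\Bigr]^{1/2} \\
&\lesssim\mathcal H\,\sigma(I)^{1/2}\|g\|_w .
\end{aligned}$$

We also use the orthogonality of the functions $g_{F,J}$. This completes the proof of (7.7). 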

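7.3. The Poisson Testing Inequality: The Remainder. Now we turn to proving the following estimate for the remainder of the first testing condition (7.4):

$$\int_{3\hat I - \hat I} P(\sigma\cdot I)^2\,d\mu \lesssim \mathcal A_2\,\sigma(I).$$

With $F_J$ the unique $F \in \mathcal F$ with $J \in \mathcal J^*(F)$, estimate

$$\int_{3\hat I - \hat I} P(\sigma\cdot I)^2\,d\mu = \sum_{J \,:\, (x_J,|J|)\in 3\hat I-\hat I} P(\sigma\cdot I,J)^2\,\Big\|P^w_{F_J,J}\frac{x}{|J|}\Big\|_w^2 \le \sum_{J \,:\, J\subset 3I-I} P(\sigma\cdot I,J)^2\,\Big\|P^w_{F_J,J}\frac{x}{|J|}\Big\|_w^2.$$

We organize the sum according to the length of $J$ and then use the Poisson inequality (6.2), available because of goodness of intervals $J$:

$$\sum_{n=0}^{\infty}\ \sum_{\substack{J \,:\, J\subset 3I-I \\ |J| = 2^{-n}|I|}} 2^{-n(2-4\epsilon)}\,\frac{\sigma(I)^2}{|I|^2}\,w(J) \lesssim \sum_{n=0}^{\infty} 2^{-n(2-4\epsilon)}\,\frac{\sigma(I)^2\,w(3I)}{|3I|^2} \lesssim \mathcal A_2\,\sigma(I).$$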

7.4. The Dual Poisson Testing Inequality. We are considering (7.5). Note that the expressions on 
the two sides of this inequality are 



tV(dx,dt) = _^ Y_ \\K, 



J-'*-|lw ■> 



Jci 



llpw yI|2 

FeJ-jeJ*(F) 
Jci 



2 + |x-Xj|2 



Expand the square in $\|P^*_\mu(t\hat I)\cdot 3I\|_\sigma^2$, writing it as a sum over $s \ge 0$ of the terms $T_s$ defined below.
In the sum, the relative lengths of $J$ and $J'$ are fixed by $s$.



T.-L L L I. 

FeJ"jeJ'*(F)F'eJ=-j'e^*(F') 
JCI |J'h2-|JI 

I'd 






Ipw yI|2 



31 



+ PC 



Ijf + |x-c(J')F 



dCT 



<M^ ^ llP^jxlP 



SJ^IIw 



Jci 



where M = sup sup 2~_ z~_ 

Fe^JeJ*(F)pf^j,g^(,, 

IJ'I=2-1] 
J'CI 






31 



+ |x-xjp |J'|2 + |x-c(J')| 



rMI2 



do-. 



The term $M$ is at most a constant times $\mathcal A_2 2^{-s}$, which is then trivially summable in $s \ge 0$. This argument depends upon a case analysis. Fix $J$ as in the definition of $M$, and use (7.6) to estimate $\|P^w_{F',J'}x\|_w^2 \lesssim w(J')|J'|^2$. Let us consider $J'$ with $n|J| \le \operatorname{dist}(J,J') < (n+1)|J|$. In the case that $n \ge 3$, further require that $J'$ be to the right of $J$, and set $t_n = x_J + \frac n2|J|$. The case of $J'$ being to the left of $J$ is entirely similar.

Note the restriction on the range of integration below, which is the easy case. The first term in the integral is dominated by its $L^\infty$ norm.

IT/|2 IT/I 



in=2-^|j| 

n|J|<dist(JJ')<(n+l) 



IJ'I 



3in[tn,oo) 



+ |x-Xj|2 |J'|2 + |x-c(J 



n\2 



do- 
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< 2-^'yx-^ Y. Y. 



in=2-iji 

n|J|<dist(JJ')<(n+l) 



MY) 

IJ'I 



31 



|J'P + |x-c(J 



MI2 



da 



< l-^'n-^Ai Y_ Y. 



F'eJ- J'eJ*(F'] 

IJ'h2-^|j| 

n|J|<dist(JJ')<(n+l] 



-s„-l 



< l-'n-^Ai 



The last line follows from the fact that each $J'$ can be in at most one collection $\mathcal J^*(F')$. And this is trivially summable in $n \ge 3$ to the claimed bound. This same argument is easily adapted to the case of $n = 0, 1, 2$.

In the complementary range of integration, we also sum in $n$, and use the $L^\infty$ norm on the second term in the integral.



F'eJ"n=4 J'eJ'*(F') 

IJ'h2-^IJI 
n|J|<dist(JJ')<(n+l) 



W(J') 



r/|2 



3Inhoo,tn]lJP + |x-Xj|2 |J'|2 + |x-c(J')|2 

- w(J + n|J| 



da 



2-2S y ^nrr 



n=4 



31 



+ x-c 



rda 



<2-2^P(wj,J)P(a,J)<2-^yi2. 
Indeed, this is the only place in which the two-sided Poisson $\mathcal A_2$ condition is needed.



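8. Two Weight Poisson Inequality

We give a proof of a two weight inequality for the Poisson integral. Let us phrase the inequality. Let $\sigma$ be a weight on $\mathbb R$, and $\mu$ be a weight on $\mathbb R^2_+$, and consider the inequality

$$\|P_\sigma f(x,t)\|_{L^2(\mathbb R^2_+,\mu)} \le \mathcal N_P\,\|f\|_{L^2(\mathbb R,\sigma)}, \tag{8.1}$$

where, for a measure $\nu$ on $\mathbb R$,

$$P_\nu(x,t) := \int_{\mathbb R}\frac{t}{t^2+|x-y|^2}\,d\nu(y) = p_t * \nu(x).$$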



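Theorem G. The inequality (8.1) holds if and only if these inequalities hold uniformly over all intervals $I$ in a fixed dyadic grid $\mathcal D$.

$$\int_{3\hat I} P_\sigma(I)^2\,d\mu(x,t) \le \mathcal T_P^2\,\sigma(I), \qquad 3\hat I := (3I)\times[0,|I|],$$

$$\int_{3I} P^*_\mu(t\hat I)^2\,d\sigma(x) \le \mathcal T_P^2 \int_{\hat I} t^2\,\mu(dx,dt).$$

In the second line, $P^*$ is the dual operator, mapping $L^2(\mathbb R^2_+,\mu)$ to $L^2(\mathbb R,\sigma)$, and $\hat I := I\times[0,|I|]$ is a Carleson cube. Moreover $\mathcal N_P \simeq \mathcal T_P$.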
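To make the operator in (8.1) concrete, here is a minimal numerical sketch, assuming a purely atomic weight $\sigma$ so that the integral defining $P_\sigma f(x,t)$ becomes a finite sum. The function name `poisson_extension` and the sample data are hypothetical and serve only as an illustration; nothing here is part of the proof.

```python
def poisson_extension(atoms, f, x, t):
    """Evaluate P_sigma f(x, t) = sum_y t / (t^2 + (x - y)^2) * f(y) * sigma({y})
    for a purely atomic weight sigma, given as a list of (location, mass) pairs."""
    if t <= 0:
        raise ValueError("t must be positive")
    return sum(t / (t * t + (x - y) ** 2) * f(y) * mass for y, mass in atoms)


# Hypothetical example: sigma = delta_0 + 2 * delta_1 and f identically 1.
sigma = [(0.0, 1.0), (1.0, 2.0)]
value = poisson_extension(sigma, lambda y: 1.0, x=0.0, t=1.0)
```

The kernel is normalized exactly as in the display above, without the factor $1/\pi$ of the classical Poisson kernel.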
The contrast between this statement and other theorems we have highlighted is the restriction of the intervals $I$ to lie in a fixed dyadic grid, but compensated by the fact that the integrals are computed over triples of the corresponding interval.

There is no fundamentally Hilbertian aspect to the proof, so the $L^p$ variant of the theorem holds. But these details are left to the reader.

8.1. Proof. 

8.1.1. Initial Steps. Assume that $\sigma$ is restricted to some large dyadic interval $I_0$, and that $\mu$ is restricted to $3\hat I_0$. This is suppressed in the notation, and we return to it at the construction of the principal cubes below.

Take non-negative $\phi \in L^2(3\hat I_0,\mu)$, and consider the open sets $\Omega_k := \{P^*_\mu\phi > 2^k\} \subset \mathbb R$. Take $\mathcal I_k$ to be a Whitney decomposition of the set $\Omega_k$. Namely, an interval $I \in \mathcal D$ is in $\mathcal I_k$ if and only if $I$ is maximal subject to the conditions $3I \subset \Omega_k$, but $5I \not\subset \Omega_k$. These collections have these properties.

Disjoint Cover: $\Omega_k = \bigcup_{I\in\mathcal I_k} I$, and the intervals $I \in \mathcal I_k$ are either equal or disjoint, aside from endpoints.

Whitney Condition: $3I \subset \Omega_k$, but $5I \not\subset \Omega_k$.

Nested Property: If $I \in \mathcal I_k$ and $I' \in \mathcal I_{k'}$ with $I \subsetneq I'$, then $k > k'$.

Bounded Overlaps: For all $k \in \mathbb Z$, $\sum_{I\in\mathcal I_k} \mathbf 1_{3I} \lesssim \mathbf 1_{\Omega_k}$.

The bounded overlaps property requires explanation. For fixed $k \in \mathbb N$, suppose there are intervals $I, I' \in \mathcal I_k$ with $|I'| < 8|I|$, but $3I' \cap 3I \ne \emptyset$. Then, it follows that for one of the two children $J$ of $I$, there holds $5I' \subset 5J$, hence $5J \not\subset \Omega_k$, which contradicts the maximality of $I$.
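The Whitney decomposition can be illustrated numerically. The sketch below is only an illustration under simplifying assumptions: the open set is a single interval $(a,b)$ rather than a level set $\Omega_k$, the search is truncated to dyadic lengths between $2^{\text{min\_exp}}$ and $2^{\text{top\_exp}}$, and the name `whitney_intervals` is hypothetical. It walks the dyadic tree from the top and keeps the first interval $I$ on each branch with $3I \subset (a,b)$, which is then maximal with that property.

```python
import math


def whitney_intervals(a, b, top_exp, min_exp):
    """Maximal dyadic intervals I = [j*2^k, (j+1)*2^k), min_exp <= k <= top_exp,
    whose triple 3I is contained in the closure of (a, b)."""
    found = []

    def descend(k, j):
        length = 2.0 ** k
        left, right = j * length, (j + 1) * length
        if right <= a or left >= b:
            return  # I does not meet (a, b)
        if a <= left - length and right + length <= b:
            found.append((left, right))  # 3I fits: I is maximal on this branch
            return
        if k > min_exp:
            descend(k - 1, 2 * j)
            descend(k - 1, 2 * j + 1)

    top = 2.0 ** top_exp
    for j in range(math.floor(a / top), math.ceil(b / top)):
        descend(top_exp, j)
    return found


# Intervals shrink geometrically toward the endpoints of (0, 1).
cover = whitney_intervals(0.0, 1.0, top_exp=0, min_exp=-4)
```

For each selected interval below the top scale, the dyadic parent fails the triple condition, and since the triple of the parent sits inside $5I$, the condition $5I \not\subset (a,b)$ is automatic, matching the Whitney Condition above.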
This is the important

Lemma 8.2. [Maximum Principle] There is a constant $C$ so that

$$P^*_\mu(\phi\cdot(3\hat I)^c)(x) \le C\,2^k, \qquad x \in I,\ I \in \mathcal I_k,\ k \in \mathbb Z.$$

Proof. For $z \in 5I \setminus \Omega_k$, and any $(y,t) \notin 3\hat I$, there holds

$$p_t(x-y) \le C\,p_t(z-y).$$

Multiply this by $\phi(y,t)\cdot(3\hat I)^c$, and integrate with respect to $\mu$, to see that

$$P^*_\mu(\phi\cdot(3\hat I)^c)(x) \le C\,P^*_\mu\phi(z) \le C\,2^k.$$



8.1.2. The First Estimate. For $0 < \delta < 1$, we will show that

(8.3) ||P*4)||<,<Tp||4)||^ + 5||P;l(t)||(, + remainder. 

The middle term can be absorbed into the left hand side, and the remainder, which is further split into 
two terms, is the core of the argument. 

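Begin with the familiar formula below, where $m$ is an integer related to the maximum principle.

$$\begin{aligned}
\|P^*_\mu\phi\|_\sigma^2 &= \sum_{k\in\mathbb Z}\int_{\Omega_{k+m}\setminus\Omega_{k+m+1}} |P^*_\mu\phi|^2\,\sigma(dx) \\
&\lesssim \sum_{k\in\mathbb Z} 2^{2k}\,\sigma(\Omega_{k+m}\setminus\Omega_{k+m+1}) \\
&= \sum_{k\in\mathbb Z} 2^{2k} \sum_{I\in\mathcal I_k} \sigma\big(I \cap (\Omega_{k+m}\setminus\Omega_{k+m+1})\big) \\
&=: \sum_{k\in\mathbb Z} 2^{2k} \sum_{I\in\mathcal I_k} \sigma(F_k(I)).
\end{aligned} \tag{8.4}$$

The notation in the last line reflects the fact that a given interval $I$ can be in many collections $\mathcal I_k$, a fact we will have to confront below.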

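Now, the sum below is restricted by $0 < \delta < 1$:

$$\sum_{k\in\mathbb Z} 2^{2k} \sum_{\substack{I\in\mathcal I_k \\ \sigma(F_k(I)) < \delta\sigma(I)}} \sigma(F_k(I)) \lesssim \delta\,\|P^*_\mu\phi\|_\sigma^2.$$

This is the middle term in (8.3).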

For $I \in \mathcal I_k$ with $\sigma(F_k(I)) \ge \delta\sigma(I)$, and each $x \in F_k(I)$, it follows from the maximum principle that

p;4)(x) = p;(4) • 3t](x) + p;(4) • {3t)^)(x) 

>P;(ct)-3t](x]-C2'^>p;(ct)-3t)(x), 

for $m \in \mathbb N$ sufficiently large, but absolute.

In the estimate below, an integral over $\mathbb R$, by duality, is written as an integral over $\mathbb R^2_+$. In the latter space, we set $\hat\Omega_k := \bigcup_{I\in\mathcal I_k} \hat I$. The latter squares are disjoint.

'^;((t)-3T){x)cT(dx) 



2^< 



o-(Fk(I)] 
1 

1 



i=k(i] 



31 



V(lF,(I))•4>^(dx,dt) 



3i-a 



^<,(lp^(i))-4)^(dx,dt] 



k+m 



+ 



1 



o-(Fk(I)) J 



'allFk(I)J- 



4) M.(dx, dt) 



J3tn^k+,^ 
=:A(k,I)+B(k,I). 

The first term is easy. By positivity and testing, and assuming that $\sigma(F_k(I)) \ge \delta\sigma(I)$,



A(k,I) < 8- 



1 



a(I) 



V(I)•ct)^(dx,dt) 



3i-nk^ 



< 8 



-1 



1 



^<.(I)IUI|l3t-n...*IU 



a(I)" "^^"-.5l-i2k+m 

Thus, we should estimate 



keN leXk 

(^(l=k(I])>6a(I] 



< S-^7i 



keN leXk 



A^^ ^(dx,dt) 




<&-''Ji\ml- 



The last line follows from the assertion that



iLLi 



st-^v 



< 1 



kez leXic 

But this is the direct consequence of the bounded overlaps property. 

The inequality (8.3) is established. To be explicit, the remainder term is as below; its analysis is the 
core of the argument. 



Y_ Y. B(k,I)V(Fk(I)) 



kez leXk 

(^(Fk(I]]>5a(I] 



(8.5) 



< 



I. L 

kez leXk 

a(Fk(I))>6(T(I) 



o-(I) 



iinciv 



V(lFk(i))-*^(dx,dt) 



8.1.3. The Principal Cubes and the Second Estimate. In (8.5), we dominate $P_\sigma(\mathbf 1_{F_k(I)}) \le P_\sigma I$. It would be preferable that the latter function be basically constant on $3\hat I$, but rather

$$P_\sigma I(x,t) \simeq \frac{t}{|I|}\,P_\sigma I(x,|I|), \qquad (x,t) \in 3\hat I.$$

Accordingly, set $\tilde\mu(dx,dt) = t^2\,\mu(dx,dt)$.

Using the nested property of the Whitney decomposition, write the region of integration using the collection $\mathcal I_{k+m}$.



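$$\begin{aligned}
\int_{3\hat I\cap\hat\Omega_{k+m}} P_\sigma(\mathbf 1_{F_k(I)})\cdot\phi\ \mu(dx,dt)
&= \sum_{\substack{J\in\mathcal I_{k+m} \\ J\subset 3I}} \int_{\hat J} P_\sigma(\mathbf 1_{F_k(I)})\cdot\phi\ \mu(dx,dt) \\
&\le \sum_{\substack{J\in\mathcal I_{k+m} \\ J\subset 3I}} \int_{\hat J} (P_\sigma I)\cdot\phi\ \mu(dx,dt) \\
&\lesssim \sum_{\substack{J\in\mathcal I_{k+m} \\ J\subset 3I}} \int_{\hat J} (P_\sigma I)\,t^{-1}\,\tilde\mu(dx,dt) \times \frac{1}{\tilde\mu(\hat J)}\int_{\hat J} \phi\,t^{-1}\,\tilde\mu(dx,dt).
\end{aligned} \tag{8.6}$$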



This calculation suggests that we use the maximal function

$$M_{\tilde\mu}\psi := \sup_{I\in\mathcal D} \mathbf 1_{\hat I}\cdot\mathbb E^{\tilde\mu}_{\hat I}|\psi|,$$

which is a bounded operator on $L^2(\tilde\mu)$.

Set the principal cubes, or intervals, as follows. Initialize $\mathcal G$ to be $\{I_0\}$, where this is the large interval on which $\sigma$ is supported. In the inductive stage, for $I \in \mathcal G$ minimal, add to $\mathcal G$ those maximal dyadic children $J$ such that $\alpha(J) := \mathbb E^{\tilde\mu}_{\hat J}\,\phi t^{-1} > 10\,\alpha(I)$. If there are no such children, for any minimal interval in $\mathcal G$, the construction stops. The maximal function bounds imply

$$\sum_{I\in\mathcal G} \alpha(I)^2\,\tilde\mu(\hat I) \lesssim \|\phi t^{-1}\|_{\tilde\mu}^2 = \|\phi\|_\mu^2. \tag{8.7}$$

The last equality follows by inspection.
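The stopping-time selection of principal intervals can be sketched in code. This is a minimal illustration under stated assumptions, not the construction used in the proof: dyadic subintervals of $[0,1)$ are encoded as pairs `(depth, index)`, the averaging functional `alpha` is supplied by the caller (it stands in for $\alpha(J)$ above), the search is cut off at `max_depth`, and the name `principal_intervals` is hypothetical.

```python
def principal_intervals(alpha, root=(0, 0), max_depth=8, factor=10.0):
    """Select principal intervals below `root`. The pair (d, j) encodes the
    dyadic interval [j / 2^d, (j + 1) / 2^d). Below each selected interval,
    the maximal descendants J with alpha(J) > factor * alpha(parent) are added."""
    principal = [root]
    frontier = [root]
    while frontier:
        parent = frontier.pop()
        threshold = factor * alpha(parent)
        stack = [(parent[0] + 1, 2 * parent[1]), (parent[0] + 1, 2 * parent[1] + 1)]
        while stack:
            d, j = stack.pop()
            if d > max_depth:
                continue  # truncation of the dyadic tree
            if alpha((d, j)) > threshold:
                principal.append((d, j))  # stop here: J is maximal on this branch
                frontier.append((d, j))
            else:
                stack.extend([(d + 1, 2 * j), (d + 1, 2 * j + 1)])
    return principal


# Hypothetical averages of a unit point mass at 0: 2^d on [0, 2^-d), else 0.
def point_mass_average(interval):
    d, j = interval
    return 2.0 ** d if j == 0 else 0.0


selected = principal_intervals(point_mass_average)
```

In this example the averages double at each generation, so a new principal interval appears every four generations, the first depth at which $2^d$ exceeds ten times the average on the previous principal interval.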

The estimate below controls part of the sum in (8.5). Appealing to (8.6), but also imposing a condition on the principal cubes, estimate as below. In the sum over $\mathcal I_{k+m}$,

$$I_\theta \in \{I,\ I+|I|,\ I-|I|\}$$

is fixed. The union of these three intervals is $3I$. The bound below will be independent of the choice of $I_\theta$.



Z "(')-■[ Z 



(T(Fk(I))>6(T(I) 



j£2'k+m 
JCle, 7tgJ=7tg;Ie 



VI)t"V(dx,dt) X - 



fi(J) 



ct)t"V(dx, dt) 



T 



< Y. o-a)-V(7tgie)2[ Y. 



leXk 

o-(I)>6o-(I) 



JSlk+m 
JCle, ng]=nglg 



J)t-'il(dx,dt) 



^ Y_ CT(I)-I(x(7rgle 



l6Xk 
(j(Fk(I))>5cr(I) 



^*n(tIe)cT(dx) 



<7^ Y_ a(7TgIe)2|l(te 



leXk 

tT(Fk{I))>5a{I) 

Duality is used to pass to the testing condition for $P^*$.
Sum the term above over $k \in \mathbb Z$.



(8.8) Y. L oc[nglem%) = Y°'^G^^Y_ Y. 

Geg kez leXk 

(^(Fk(I))>5a(I), 7reIe=G 



fi(te 



kez leXk 

(T(Fk(I))>MI) 



The difficulty with this sum is that a given interval $I$ can be in many collections $\mathcal I_k$, so that the term $\tilde\mu(\hat I_\theta)$ can contribute to the sum many times. See, however, this Lemma.

Lemma 8.9. For any dyadic interval $I$, the set of integers below consists of at most $\delta^{-1}$ consecutive integers.

$$\{k \in \mathbb Z : I \in \mathcal I_k,\ \sigma(F_k(I)) \ge \delta\sigma(I)\}$$

Proof. That the integers in the set are consecutive follows from the nested property of the collections $\mathcal I_k$. The sets $F_k(I) \subset I$, with $I$ fixed, are pairwise disjoint, as follows from the definition in (8.4). And each has $\sigma$ measure at least $\delta\sigma(I)$. $\square$
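The counting step in this proof can be written out in one line. In the display below, $S(I)$ is a name introduced here only for the set of integers in Lemma 8.9, and we assume $\sigma(I) > 0$; the bound uses only the disjointness of the sets $F_k(I) \subset I$ and the lower bound on their $\sigma$ measure.

```latex
\[
\sigma(I) \;\ge\; \sum_{k \in S(I)} \sigma\bigl(F_k(I)\bigr)
\;\ge\; \# S(I)\cdot \delta\,\sigma(I),
\qquad\text{hence}\qquad \# S(I) \le \delta^{-1}.
\]
```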



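It then follows from the disjoint cover property and (8.7), that

$$\mathrm{RHS}(8.8) \le \delta^{-1} \sum_{G\in\mathcal G} \alpha(G)^2\,\tilde\mu(\hat G) \lesssim \delta^{-1}\|\phi\|_\mu^2.$$

This holds for each of the three possible choices of $I_\theta$, so completes this case.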
8.1.4. The Third Estimate. It remains to bound the sum over $k \in \mathbb Z$ of the expression below, in which the parents of $I_\theta$ and $J$ in the collection $\mathcal G$ differ.



Y_ ^w-' 


[ L 


leXk 


j6Xk+m 


'j(Fk{I))>5(T(I) 


JCle, TtgJCTtgle 



(jl)t"'p,(dx, dt) X - 



fi(J) 



c|)t"'p,(dx, dt) 



The term in brackets above is estimated by inserting powers of $\tilde\mu(\hat J)^{\pm 1/2}$ and using Cauchy-Schwarz. This leads to, on the one hand,

i2 



Y_ Hj)oc[ng]Y 



JCle, TteJCTtgle 
And, on the other hand, using Cauchy-Schwarz, inspection, and the testing condition, 

2 



L 



JCle, TTgjQTtgle 



! 



<,I)t-V(dx, dt) fi(t)-^ < Y. 



Jcle, TTgJgTtgle 
< 7lam . 



»aI)V(dx,dt) 



Putting these pieces together, it remains to estimate the sum 



(8.10) Y_ L 



Y_ fi(T)cx{7rgJ)^ 



kez leXk leXk+m 

<?(Fk(I])>6(T(I]JCle, TtgJCTrgle 

Again, a given interval $J$ can occur many times in the sum above. But, in view of the following Lemma, the sum above is bounded by the expression in (8.7), completing the proof.

Lemma 8.11. There is an absolute $C$ so that for any $G \in \mathcal G$, the cardinality of the set below is at most $C$.

$$\{k : \pi_{\mathcal G}J = G,\ J \in \mathcal I_{k+m} \text{ contributes to the $k$th sum in (8.10)}\}$$

Proof. For $G \in \mathcal G$, consider data

$$k_1 \ge k_2 \ge \cdots \ge k_n,$$

$$J_1 \subset J_2 \subset \cdots \subset J_n, \qquad \pi_{\mathcal G}J_1 = \cdots = \pi_{\mathcal G}J_n = G,$$

$$(I_1)_\theta \subset (I_2)_\theta \subset \cdots \subset (I_n)_\theta,$$

$$J_t \in \mathcal I_{k_t+m}, \quad J_t \subset (I_t)_\theta, \quad I_t \in \mathcal I_{k_t}, \quad \pi_{\mathcal G}J_t \subsetneq \pi_{\mathcal G}(I_t)_\theta, \qquad 1 \le t \le n.$$

An upper bound on $n$ is what is needed.

All of the intervals $3I_t$ are overlapping. And some of the $k_t$ can be equal, but only a bounded number can be equal, by the bounded overlaps property. Hence, after deleting some data and relabeling, the $k_t$ can be taken to be strictly decreasing. A number of the $I_t$ can have the same length, but only one of three possible positions, since $J_1 \subset 3I_t$ for all $t$. From Lemma 8.9, there are a bounded number of such $t$ with $I_t$ having a fixed length and position. Thus, after a further deletion and relabeling, the lengths $|I_t|$ can be taken to be strictly increasing. But then, since $I_t \in \mathcal I_{k_t}$ and $J_t \in \mathcal I_{k_t+m}$, if $m < n$, for some $t$, there holds

$$J_1 \subset (I_t)_\theta \subset J_n.$$

This inclusion contradicts $\pi_{\mathcal G}J_1 = \pi_{\mathcal G}J_n = G$ and $\pi_{\mathcal G}J_1 \subsetneq \pi_{\mathcal G}(I_t)_\theta$. The Lemma is established. $\square$
8.2. Context and Discussion.

8.2.1. This Theorem is in Sawyer's paper [48]. The argument is as in that paper, with the addition of the combinatorial lemmas, Lemma 8.9 and 8.11, which appeared in [50]. One can follow the argument of [19], where all the essential difficulties are present. The fractional integral version of the two weight theorem has been broadly cited, but the use of the two weight inequality for the Poisson integral herein is the only one I am aware of.

8.2.2. The underlying technique, a level set technique based upon a maximum principle, has been used in a variety of contexts. An extension to singular integrals is explored in [20]. Various simplifications for dyadic singular integrals are in [16]. And vector-valued dyadic operators seemingly require the same technique [51].

8.2.3. For dyadic positive operators, there is a much simpler approach based upon the parallel corona of [23]. See the last page of [15], and [52].

9. Compact Operators 

We discuss the proof of Theorem 1.7. The reader might have suspected that the characterization would have been different, namely, that the conditions would be phrased in terms of 'vanishing' conditions. The following example, however, stands in the way.

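Take $w = \delta_0$ to be a point mass at zero. Then, take $\sigma$ to be absolutely continuous with respect to Lebesgue measure, with density equal to $x(\log x)^{-2}$ on an interval $(0,a)$, $0 < a < 1$, and zero elsewhere. Notice that

$$\int |H w|^2\,d\sigma = \int_0^a \frac{dx}{x(\log x)^2} \simeq 1.$$

And $L^2(w)$ has but one dimension; thus $H_w$ is a bounded compact map from $L^2(w)$ to $L^2(\sigma)$. It is clear that as $\Lambda \downarrow 0$,

$$\frac{w([0,\Lambda))}{\Lambda}\,P(\sigma,[0,\Lambda)) \simeq \int_0^a \frac{x\,dx}{(\Lambda+x)^2(\log x)^2} \longrightarrow \int_0^a \frac{dx}{x(\log x)^2} \simeq 1.$$

That is, there is no decay in the $A_2$ ratio.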
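The lack of decay in this example can be checked numerically. The sketch below rests on assumptions that are not fixed by the text: the density of $\sigma$ is taken to be $x(\log x)^{-2}$ on $(0, 1/2)$, and the Poisson average is taken to be $P(\sigma,[0,\Lambda)) = \int \Lambda(\Lambda + x)^{-2}\,\sigma(dx)$. After the substitution $x = e^{-u}$, the product $\Lambda^{-1} w([0,\Lambda))\,P(\sigma,[0,\Lambda))$ becomes $\int_{\log 2}^{\infty} (1+\Lambda e^{u})^{-2}u^{-2}\,du$, which is approximated with the trapezoid rule. The function name `a2_ratio` is hypothetical.

```python
import math


def a2_ratio(lam, a=0.5, steps=200_000):
    """Approximate (w([0, lam)) / lam) * P(sigma, [0, lam)) for w = delta_0 and
    d sigma = x / (log x)^2 dx on (0, a), using x = exp(-u) and the trapezoid rule."""
    u0 = math.log(1.0 / a)
    u1 = math.log(1.0 / lam) + 40.0  # beyond u1 the integrand is below e^{-80}

    def g(u):
        return 1.0 / ((1.0 + lam * math.exp(u)) ** 2 * u * u)

    h = (u1 - u0) / steps
    total = 0.5 * (g(u0) + g(u1))
    for i in range(1, steps):
        total += g(u0 + i * h)
    return total * h


# The ratio increases toward 1 / log(2) as lam decreases; it does not decay.
ratio_coarse = a2_ratio(1e-6)
ratio_fine = a2_ratio(1e-12)
limit = 1.0 / math.log(2.0)
```

Under these assumptions the limiting value is $\int_0^{1/2} \frac{dx}{x(\log x)^2} = 1/\log 2$, and the computed ratios approach it slowly from below, with an error of order $1/\log(1/\Lambda)$.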



9.1. Necessity. Assume that $H_\sigma : L^2(\sigma) \to L^2(w)$ is bounded and compact, and by way of contradiction, that for a positive $\Lambda$, the collection $\mathcal I_\Lambda$ defined in (1.8)-(1.10) admits an infinite family $\{I_n : n \in \mathbb N\}$ of pairwise disjoint intervals. Then, we can assume that each $I_n$ satisfies just one of the three conditions (1.8)-(1.10).

Say that each $I_n$ satisfies (1.9): $\int_{I_n} |H_\sigma I_n|^2\,dw \ge \Lambda\sigma(I_n)$. But then the norm one functions $I_n\cdot\sigma(I_n)^{-1/2}$ are pairwise orthogonal, and their images under $H_\sigma$ have norms bounded below. Hence $H_\sigma$ cannot be compact.

Say that each $I_n$ satisfies (1.8), and moreover $P(\sigma,I_n)\,\frac{w(I_n)}{|I_n|} \ge \Lambda$. By examination of the proof of the necessity of the $A_2$ condition, it follows that the image of the unit ball of $L^2(\sigma)$ under $H_\sigma$ has full norm on a countable number of disjoint sets in $L^2(w)$. Hence, $H_\sigma$ cannot be compact.
9.2. Sufficiency. Say that $H_\sigma$ is $\Lambda$-compact if and only if there is a compact operator $T_\Lambda$ so that $\|H_\sigma - T_\Lambda\|_{L^2(\sigma)\to L^2(w)} \lesssim \Lambda$. The implied constant in this inequality will change, but is never more than an absolute constant. We need to verify that for each $0 < \Lambda < 1$, $H_\sigma$ is $\Lambda$-compact. The argument is not straightforward. Initial steps change the weights, to arrive at a situation in which the expected 'vanishing' criterion reveals itself. Then, the sufficiency part of the main theorem is modified to prove the norm estimate above.

9.2.1. A modification of the weights. The weights are modified, to fit the proof of the main theorem.

Lemma 9.1. Suppose the pair of weights $w, \sigma$ meet the assumption of Theorem 1.7. Then, for each $\Lambda > 0$, there is a decomposition of $w$ and $\sigma$ into $\sigma = \sigma_1 + \sigma_2$, and likewise for $w$, so that

(1) Both $\sigma_2$ and $w_2$ are finite sums of Dirac masses.

(2) $H_{\sigma_2}$ is bounded (hence compact) from $L^2(\sigma_2)$ to $L^2(w)$, and likewise for $w_2$.

(3) For the family of intervals $\mathcal I_\Lambda(\sigma_1,w_1)$, as defined in Theorem 1.7,

$$\inf\{|I| : I \in \mathcal I_\Lambda(\sigma_1,w_1)\} > 0. \tag{9.2}$$

Proof. Suppose that

$$\inf\{|I| : I \in \mathcal I_\Lambda(\sigma,w)\} = 0. \tag{9.3}$$

Then consider $\{I_n\} \subset \mathcal I_\Lambda(\sigma,w)$ with $|I_n|$ decreasing to zero. If the limit inferior of these sets is empty, we violate the main assumption on the collection $\mathcal I_\Lambda$. Thus, the limit inferior is not empty, but it must be finite, else we obtain a second contradiction to the assumption on $\mathcal I_\Lambda(\sigma,w)$.

Thus, let $x_0$ be a point for which there are intervals $\{I_n\} \subset \mathcal I_\Lambda$ with $\bigcap_n I_n = \{x_0\}$. We claim that either
$\sigma(\{x_0\})$ or $w(\{x_0\})$ is not zero. (They can both be non-zero.) Suppose by way of contradiction, that both
are zero. And, suppose that Jj |H[yIn|^ dw > Aa(In). Now, for each fixed n, we can select n' > n so 
that ail-n'] < l<y{ln), and 



iHalnl^ dw + 



i„' 



7 A 

iHaln'l dw < -0-(It 



Note in particular that it is necessary that neither $w$ nor $\sigma$ have a point mass at $x_0$ for these conditions to hold. Hence, we see that $\int_{I_n - I_{n'}} |H_\sigma(I_n - I_{n'})|^2\,dw \gtrsim \Lambda\sigma(I_n - I_{n'})$. And so we can build an infinite family of disjoint intervals in the collection $\mathcal I_\Lambda(\sigma,w)$.

But, it could be that the intervals In only satisfy P(o', In) ^ | > A. The strategy is as before, but now 
we have to recall the proof of the necessity of the half-Poisson Aj condition. Fix interval In = (a, b). 
We can choose point a < a' < b, and integer n' > n so that w((a, a')), w([a',b)) > ^w(In), and In' 
is contained in one these two subintervals. For definiteness, let us assume that In' C (a, a'), and that, 
using the notation of (1.11), there holds 

-Halpi^l(Q',oo)jW ^ — ;y=f> a<x<a. 



We conclude that for a unit norm function fn G I-^(o'), there holds Jfim/)|Ho-fn|^ dw > A. And, since 
w(In') — > 0, there is a finite integer tlq, so that for all n' > no, 

iH^fnl^ dw> A. 

(a,a')\I,, 
Hence, we can select an infinite sequence of disjoint intervals $J_k$, and unit norm functions $f_k \in L^2(\sigma)$ so that $\int_{J_k} |H_\sigma f_k|^2\,dw \gtrsim \Lambda$. Thus, $H_\sigma$ cannot be compact.

Now, let $x_k$, $1 \le k \le K$, be the finite number of points for which there is a sequence of intervals in $\mathcal I_\Lambda(\sigma,w)$ which shrink down to that point $x_k$. Write $\sigma = \sigma_1 + \sigma_2$, where $\sigma_2 := \sum_{k=1}^K \sigma(\{x_k\})\,\delta_{x_k}$. Use the similar notation for $w$. One, but not both, of the measures $\sigma_2$ and $w_2$ can be zero. Now, for $j,k = 1,2$, $H_{\sigma_j}$ maps $L^2(\sigma_j)$ to $L^2(w_k)$. And, as long as $j,k$ are not both 1, the map is compact.

Observe that $\mathcal I_\Lambda(\sigma_1,w_1)$ must fail the condition (9.3), for otherwise, we would violate the construction. This completes the proof. $\square$

Therefore, it remains to show that $H_\sigma$ is $\Lambda$-compact, under the additional assumption that the collection $\mathcal I_\Lambda(\sigma,w)$ satisfies the estimate (9.2). Now, let us prove

Lemma 9.4. Suppose the pair of weights $w, \sigma$ meet the assumption of Theorem 1.7, and for integers $n$, let $\sigma_n := \sigma\,\mathbf 1_{\mathbb R\setminus[-n,n]}$, and likewise for $w_n$. Then,

$$\lim_{n} \|H_{\sigma_n}\|_{L^2(\sigma_n)\to L^2(w_n)} = 0.$$

Proof. Proceed by contradiction. Fix $\liminf_n \|H_{\sigma_n}\|_{L^2(\sigma_n)\to L^2(w_n)} > \Lambda > 0$. Then, using our characterization of the boundedness of $H_\sigma$, there is an absolute constant $c$, so that for each integer $n$, there is an interval $I_n \in \mathcal I_{c\Lambda}(\sigma_n,w_n)$, which does not intersect $[-n,n]$. Suppose this interval satisfies $P(\sigma_n,I_n)\,\frac{w_n(I_n)}{|I_n|} \ge c\Lambda$. Then, there is an integer $n' > n$ for which

Pfn- 1 T }]^]1^ ^ c. 

P(0-nl[-n',n'])^nJ ,. , > 2^^ ■ 

I -'■Til 

In this manner, we can construct an infinite family of disjoint intervals In G ^cX/ii^)'^)' completing the 
contradiction. n 

Thus, it remains to show that $H_\sigma$ is $\Lambda$-compact, under the additional assumptions that (1) the collection $\mathcal I_\Lambda(\sigma,w)$ satisfies the estimate (9.2), and we denote that infimum by $\eta > 0$, and (2) that $\sigma$ and $w$ are supported on an interval $I_0 = [-N,N]$.

9.2.2. Modifying the Hilbert Transform. Modify the Hilbert transform as follows. Let 

Kl (y)l{y : a<|y|<2N} = -\y : a<|y|<2N} 

be supported on {y : ^ < |y| < 4N}, and satisfy |^Ki (y)| < y-^. Let Tjf(x) = jKi (x-y)f(y) a(dy). 
The image of the unit ball in L^(o") under J^ consists of continuous differentiable functions, with 

\-^Tli[x)\ <Ti-2 [|f(y)| da<^-'^[iy/\ 
ax J 

Thus, the image of the unit ball is compact in L^(w), since w is compactly supported. 

Let Ko(y) be the part of the Hilbert transform kernel that remains. Thus, Ko(y) is supported on 
^ < |y| < 4- agrees with ^ for < |y| < ^, and has derivative bounded by y^^ everywhere. Let 
T(jf = jKo(x — y)f(y) a(dy). It is this operator that we apply the proof of sufficiency to, with some 
modifications, in order to prove the estimate below, which completes the proof of the compactness of 
H(j, under the assumptions of Theorem 1.7, as A > is arbitrary. 




Lemma 9.5. For all $\Lambda > 0$, with $T$ as selected above, there holds $\|T_\sigma f\|_w \lesssim \Lambda\|f\|_\sigma$.

Let $\mathcal D$ be a randomly selected dyadic grid. Let $n$ be the integer with $2^n \le \eta < 2^{n+1}$, and consider the good projections, where we recall that the choice of $0 < \epsilon < \frac12$ and integer $r$ are necessary to define goodness.



p<^ f •- ^ 

n,good ' ■ /_ 



Bif-I+ Y_ ^i^- 

ieX':|i|=2'^ lev 

|I|<2'^,Iisgood 

Use the same notation for the corresponding projection on $L^2(w)$. One has

Lemma 9.6. For all dyadic grids $\mathcal D$, $0 < \epsilon < \frac12$ and integers $r$, suppose that there holds

$$|\langle T_\sigma P^\sigma_{n,\mathrm{good}}f,\ P^w_{n,\mathrm{good}}g\rangle_w| \lesssim \Lambda\|f\|_\sigma\|g\|_w;$$

then the same inequality holds without the good projections.

The leading terms in the good projections are handled as follows.

Lemma 9.7. Suppose that $f = \sum_{I\in\mathcal D \,:\, |I|=2^n} \mathbb E^\sigma_I f\cdot I$. Then,

$$|\langle T_\sigma f, g\rangle_w| \lesssim \Lambda\|f\|_\sigma\|g\|_w.$$

The dual estimate also holds.

Proof. For each interval I G P with |I| = 2^"^, note that Tq-I equals H(jl on I. Hence, 

KT,(f • I), g)^| = Kf| -1(1,1,9 -21)1^ 



< 



i^f|{ 



|H(jI • g| dw + 



2I\I 



\Jal ■ g\ dwj 



Now, by construction, the first term is at most $\Lambda\,|\mathbb E^\sigma_I f|\,\sigma(I)^{1/2}\|g\cdot I\|_w$, since $\Lambda$ controls interval testing for all sufficiently small intervals $I$. This is clearly controlled.
As for the second, note that it is at most



ij'f • 



2I\I 



T^I • g dw < lEj'f I 



2I\I 



|H(yI • g| dw. 



The latter estimate is akin to the one in (6.8), the latter being a two weight Hardy inequality estimate. It is controlled by the $A_2$ constant. But, in our present situation, the relevant $A_2$ constant is controlled by $\Lambda$, so the proof is complete. $\square$

It therefore remains to consider the case where $f$ is in the linear span of Haar functions $h^\sigma_I$ with $|I| < 2^n$, and similarly for $g$. That is, in the proof of the main theorem, we are at the point of (4.1): It suffices to show that

$$\Big|\sum_{\substack{I,J \,:\, I,J\subset 2I_0 \\ |I|,|J| < 2^n}} \langle T_\sigma\Delta^\sigma_I f,\ \Delta^w_J g\rangle_w\Big| \lesssim \Lambda\|f\|_\sigma\|g\|_w.$$

One needs this variant of the Monotonicity Principle.
Lemma 9.8. Suppose that $\nu$ is a signed measure, and $\mu$ is a positive measure with $\mu \ge |\nu|$, both supported outside an interval $I \in \mathcal D$. Suppose that neither $\mu$, $\nu$ nor $w$ has a point mass at an endpoint of $I$. Assume $J \subset I$, $J$ is good with $2^r|J| \le |I|$, where $r$ is sufficiently large. There holds

$$|\langle T\nu, g\rangle_w| \lesssim P(\mu,J)\,\Big\langle \frac{x}{|J|},\ \bar g\Big\rangle_w.$$

Here, $g \in L^2(J,w)$ has $w$-integral zero, and $\bar g = \sum_{J'} |\hat g(J')|\,h^w_{J'}$ is a Haar multiplier applied to $g$.

Proof. The only condition needed for $T$ is that the first derivative of $K_0$ is controlled. By linearity, it suffices to consider the case where $g = h^w_J$. Let $x_J$ be the center of $J$.



\{Ty,hYU < 



{Ty(x] -TY(xj]}hf (x] w(dx) 

Ko(x--y)-Ko(xj-y) Y(dy]h,j^(x] w(dx] 



^PKJ)(- 



-.KM\ 



Recall that $(x - x_J)\,h^w_J(x) \ge 0$, so that the last line follows from standard kernel estimates. $\square$

One can then repeat the proof of sufficiency in the main theorem, using the stronger form of the monotonicity principle, specific to the Hilbert transform, to verify, for instance, that the energy and functional energy inequalities hold with constant proportional to $\Lambda$. And then use the modified monotonicity property to apply them to $T_\sigma$. There are many details to the proof that remain, but we do not present them here.

10. The Proof under the Pivotal Assumption

We prove an upper bound for a two weight inequality assuming a pivotal condition on a pair of weights. The set up is as follows. Let $K(x,y)$ satisfy the size and gradient condition

$$|x-y|\cdot|\nabla K(x,y)| + |K(x,y)| \le |x-y|^{-1}.$$

Let $\mathcal N_T$ be the best constant in the inequality

$$\sup_{0<\tau<1}\ \Big\|\int_{\tau<|x-y|<\tau^{-1}} K(x,y)\,f(y)\,\sigma(dy)\Big\|_w \le \mathcal N_T\|f\|_\sigma.$$

Let $\mathscr P$ be the best constant in the pivotal inequality, defined as follows. For any interval $I_0$ and any partition $\mathcal P$ of $I_0$ into intervals such that neither $\sigma$ nor $w$ have point masses at the endpoints, there holds

$$\sum_{I\in\mathcal P} P(\sigma\cdot I_0, I)^2\,w(I) \le \mathscr P^2\,\sigma(I_0). \tag{10.1}$$

We also require that the dual inequality, with the roles of $w$ and $\sigma$ reversed, holds.

Theorem 10.2. [Nazarov-Treil-Volberg [54]] Assume that the pair of weights $w, \sigma$ do not share a common point mass, and satisfy the $A_2$ condition (1.4), and the pivotal conditions hold, namely $\mathscr P < \infty$. Then, there holds $\mathcal N_T \lesssim \mathcal T_T + A_2^{1/2} + \mathscr P$, where $\mathcal T_T$ is the best constant in the inequalities

$$\int_I |T_\sigma I|^2\,w(dx) \le \mathcal T_T^2\,\sigma(I), \qquad \int_I |T_w I|^2\,\sigma(dx) \le \mathcal T_T^2\,w(I).$$


We give the proof. This will highlight some of the difficulties that one must face in the general case. 
In addition, a quantitative higher dimensional version of this Theorem was key to [40]. We will use 
Calderon-Zygmund stopping data, to facilitate comparisons to the general case. This will also give an 
easier proof than is in [40,54]. 

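10.1. The Global To Local Reduction. One need only prove that

$$|\langle T_\sigma P^\sigma_{\mathrm{good}}f,\ P^w_{\mathrm{good}}g\rangle_w| \lesssim \mathbf T\,\|f\|_\sigma\|g\|_w,$$

where $\mathbf T := \mathcal T_T + A_2^{1/2} + \mathscr P$. Thus, $f$ and $g$ can be assumed to be good functions, which is suppressed in notation. Goodness is essential, and we highlight those parts of the argument that use goodness.
In analogy to (4.2), define

$$B^{\mathrm{above}}(f,g) := \sum_{I \,:\, I\subset I_0}\ \sum_{J \,:\, J\Subset I} \mathbb E^\sigma_{I_J}\Delta^\sigma_I f\cdot\langle T_\sigma I_J,\ \Delta^w_J g\rangle_w,$$

and define $B^{\mathrm{below}}(f,g)$ similarly. Since Lemma 4.3 depends only on the $A_2$ assumption, we have

Lemma 10.3. There holds

$$|\langle T_\sigma f, g\rangle_w - B^{\mathrm{above}}(f,g) - B^{\mathrm{below}}(f,g)| \lesssim \mathcal H\,\|f\|_\sigma\|g\|_w.$$

Thus, the main technical result is

Lemma 10.4. There holds

$$|B^{\mathrm{above}}(f,g)| \lesssim \mathbf T\,\|f\|_\sigma\|g\|_w.$$

The same inequality holds for $B^{\mathrm{below}}(f,g)$.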
In analogy to Definition 4.5, we define

Definition 10.5. Given any interval $I_0$, define $\mathcal F_{\mathrm{pivotal}}(I_0)$ to be the maximal subintervals $I \subset I_0$ such that

$$P(\sigma\cdot I_0, I)^2\,w(I) \ge 10\,\mathscr P^2\,\sigma(I). \tag{10.6}$$

There holds $\sigma(\bigcup\{F : F \in \mathcal F_{\mathrm{pivotal}}(I_0)\}) \le \frac{1}{10}\sigma(I_0)$, by the pivotal inequality (10.1).

We make the following construction for an $f \in L^2_0(I_0,\sigma)$, the subspace of $L^2(I_0,\sigma)$ of functions of mean zero. Add $I_0$ to $\mathcal F$, and set $\alpha_f(I_0) := \mathbb E^\sigma_{I_0}|f|$. In the inductive stage, if $F \in \mathcal F$ is minimal, add to $\mathcal F$ those maximal descendants $F'$ of $F$ such that either (a) $\mathbb E^\sigma_{F'}|f| > 10\,\alpha_f(F)$, or (b) $F' \in \mathcal F_{\mathrm{pivotal}}(F)$. Then define

$$\alpha_f(F') := \begin{cases} \alpha_f(F) & \mathbb E^\sigma_{F'}|f| < 10\,\alpha_f(F), \\ \mathbb E^\sigma_{F'}|f| & \text{otherwise.} \end{cases}$$

We continue to use the notations $P^\sigma_F f$ and $Q^w_F g$. Observe that Lemma 4.6 continues to hold for this choice of Calderon-Zygmund stopping data. And, in particular, the quasi-orthogonality condition (4.9) holds.

In analogy to Theorem 4.10, there holds
Lemma 10.7. [The Global to Local Reduction] There holds



|B^^°-(f,g)-Br'^(f,g)|<T||f||,||g|U, 
where Bf'^'^lf, g) := Y_ B"'^°^"(P^f, Q^g) • 

and Q^ := ^p/ .^^21^/-,^ Q^, where recall that r is the integer associated with good intervals. 

The definition of B3^°^^(f, g) is at slight variance with the definition used in Theorem 4.10. 
Proof. The bound will follow from this estimate, with geometric decay. For all s > 2r 

\Y_ Y_ B^^°-(Pff,Qr,g)| <2-/4T||f|U||g|U. 

FeJ'F'eJ":7t3,F'=F 

(Of course this sort of estimate will not hold in absence of the pivotal condition.) It suffices to show 
that there holds for all F € J", 

Y_ B^'^°^"(Pff, Q^,g)| < 2-^/4tA • B , 

F'eJ":7t3,F'=F 



where A^ :- af (F)^a(F) + ^ af (F')^o-(F') , 



f'eT:n],¥'=f 



and B^:- Y. HQpSHw- 

F'eJ":7r5^F'=F 

Then, quasi-orthogonality completes the proof. 

We next make a hole in the argument of $T_\sigma$, using $\mathcal F$ to select how to make the hole. Let $t = \lfloor s/2\rfloor$, and write $s = t + u$. Fix $F''$ with $\pi^u_{\mathcal F}F'' = F$, and $F' \in \mathcal F$ with $\pi^t_{\mathcal F}F' = F''$. We concentrate on intervals $I$ with $\pi_{\mathcal F}I = F$, and $J$ with both $\pi_{\mathcal F}J = F'$ and $\pi_{\mathcal F}I_J = F$. The second condition is not assured, and we return to it at the end of the proof. In the expression $\mathbb E^\sigma_{I_J}\Delta^\sigma_I f\cdot\langle T_\sigma I_J,\ \Delta^w_J g\rangle_w$, the argument of $T_\sigma$ is written as $I_J = (I_J - F'') + F''$. With $F''$ as the argument of $T_\sigma$, define real number $\varepsilon_J$ by
£jaf(F]:- Y. ^ij^i^- 
I -la] 

With the condition that the J^-parent of It is F, it follows that lejl < 1. And, we can write 



OfF' 



e{±}I:7T^I=7t^Ie=F F'eJ' J : 7r^J= 
7t^F'=F" J<sle 

T,F", Y_ L 'J^T9 

f'eT J:7i^J=F' 
7r^F'=F" J<Bl 

<Taf(F)o-(F")^/'[ Y. IIQp9l 



ij;Arf-(T,F",Afg)^ 



1/2 



F'eJ- 

7t^F'=F" 
Then, observe that



Y_ ^^^") < 5-^c^(T=) < 2-^a(F) , 



F"eJ" 

7t^F"=F 

as follows from the construction of J^. Therefore, by Cauchy-Schwarz, there holds 

1/2 



Y_ a)(F")<2-/4^.af(F)a(F)V2[ ^ ^ ||Q-g| 



F"eJ- 

7T^F"=F 



7t^F"=F7r^F'=F" 



(This part of the argument will work in the general case, but there is no reason that the complementary 
sum will also have geometric decay.) 

In the definition of W below, note that as t > r, r being the integer used in the definition of goodness, 
then the condition ttj-J = F' implies J d F". In this case, note that J is assumed to be strongly contained 
in F'. Apply the variant of the monotonicity principle Lemma 9.8, to see that 



V(F") 



L L 



Y_ Y. Er^Arf-a.dj-F'O.Afg), 



ee{±}I:7t^I=7t^Ie=F F'eJ' J:7i^J=F' 



<af(F) Y. p(f^(T^-T^"),r) Y. (F^'^D-i^fni 



reJH^'] 






where J'*{V') are the maximal intervals J <e F' with TtjrJ = F'. In the estimate below, we use Cauchy- 
Schwarz, and critically, the estimate of Lemma 6.1, with integer t, giving a geometric decay below. 

J:7r^J=F" 

Another application of Cauchy-Schwarz and an appeal to the pivotal condition (10.1) will show that 

1/2 



Y ^(F")<3'2-'i-^'^/2af(F)a(F)i/2[ Y t^H' 



7t^F"=F 



j:nU=¥' 



and the sum over F of this last expression is controlled by quasi-orthogonality. For < e < |, which we 
have assumed throughout, we have desired estimate. 

The argument has to this point concentrated on those pairs of intervals J ^ I with Ttj-Ij = ttj-I = F. 
It remains to prove the estimate below 

I Y Y EfA;^ipf-(TaF,A]^g) 

F6J':7T!pF=FJ:7t^-lj=F 
JgF 



< 2-'/^7 



Y «f(f')'o-(f'i 



1/2, 



F'6J':7T!pF'=F 



But this is the same method of proof, and so we omit the details. 



D 
10.2. The Local Estimate. It remains to prove the following local estimate:



IB 



above roc 



1/2 



Pff, g)| < T{cXf(F)o-(F)V2 + ^ ^ af(F')'o-(F')J + ||Pff||a} 



for then quasi-orthogonality will complete the bound on B3^°^^(f, g). 

In the bilinear form above, the argument of T(j is, for a pair of intervals J ^ I, Ij = (F — Ij) + F. 
Using linearity, and focusing on the argument of J^ being F, we can repeat the argument of (4.18), 
which depends upon the fact that the averages of f are controlled. There however is a difference in the 
argument that occurs due to our use of the Q™. Below, there is an additional requirement that Ij has 
J^-parent F. 



L L 



1/2 1 



^Arf.(T,F,Afg)^ <Taf(F)o-(F)'^^||g 



I:7r^I=FJ:J<aI,,7t^I,=F 

This bound follows the argument of (4.18), and we suppress the details. 

We need to address the $I_J$ having a different $\mathcal F$-parent, which forces $I_J$ to be in $\mathcal F$. Note that by interval testing, there holds



F'e.F:7TlpF'=Fj:J<5F' 



<T Y. af(F')o-(F')^/'[_^ §(J)2 

J:JaF' 



1/2 



F'eJ':7t!pF'=F 



<T 



Y_ «f(f')'fT(F'] 



1/2, 



f'eT:n],f'=f 



?A^f.(T,(Io-Ij),Afg 



J y/w 



It therefore remains to consider the stopping form 
Br(f,g):- Y_ Y_ 

I:7t^I=FJ:J<5l,,7r^I,=F 

Lemma 10.8. For all $F \in \mathcal F$, there holds

$$|B^{\mathrm{stop}}_F(f,g)| \lesssim \mathcal T \|f\|_\sigma \|g\|_w.$$

Proof. This depends very much on the selection of stopping intervals. In fact there is geometric decay, 
holding the relative lengths of I and J fixed. Estimate for integers s > r. 



I:7t,^I=FJ:J«I,,7t^I,=F 
|I|=21JI 



ifArf-(T,(Io-Ij),Afg) 



< 



L L 



|f(i)i 



L 



a(Ie)V2 

1 : 7t^I=F ee{±} ■■ ^' J:j€le,7t^Ij=F 

|I|=21JI 



<m[ ^ f(ir 



1/2 



I : 7TjrI=F 



J:JCF 



Pfo-fF 



1/2 



Ie)J)(^,Hf)^|0(J)| 



where M := max 



1 



sup 



ee{±}i:„^ig=F ^(le 



Y_ P(cT(F-Ie),J)Mj). 



J : Jale ,7t^I,=F 
|Ih2^IJI 
Figure 4. The approximates to the Cantor set C on the left, and on the right, the gaps, namely the components of $[0,1] - C$. The intervals on the left are in $\mathcal K$, and those on the right are in $\mathcal G$.

Here, we have (a) used the bound $|E^\sigma_{I_\epsilon} \Delta^\sigma_I f| \le |\hat f(I)| \, \sigma(I_\epsilon)^{-1/2}$; (b) appealed to the monotonicity principle, Lemma 9.8; (c) used Cauchy-Schwarz, together with the fact that for $J \Subset F$, there is a unique I containing it with length $2^s|J|$.

It remains to bound M, gaining a geometric decay in s, and appealing to the pivotal condition. Return 
to the inequality (5.2), to gain the geometric decay, 

Y_ P(c^(F - le), J)'w(J) < 2-(^-^'^P(a • F, Ie]'w(Ie) < 2-^'-'^"J'^^[le) , 

J:JgIe,7tjrI,=F 

|I|=21JI 

where the decisive point is that $I_\epsilon$ has $\mathcal F$-parent F, hence it must fail the inequality (10.6). □

11. Example Weights 

The sharpness of the different conditions in the main theorem is the subject of this section.

Theorem 11.1. There are pairs of weights $(\sigma, w)$, with no common point masses, that satisfy any one of these conditions.

(1) The pair of weights satisfies the full Poisson $A_2$ condition, but the norm inequality for the Hilbert transform (1.1) does not hold.

(2) The pair of weights satisfies the full Poisson $A_2$ condition, and the testing inequality (1.5), but the norm inequality for the Hilbert transform (1.1) does not hold.

(3) The pair of weights satisfies the two weight norm inequality (1.1), but not the pivotal condition (3.10).

11.1. The Initial Steps in the Main Construction. Let $C = \bigcap_{n=0}^{\infty} C_n$ be the standard middle third Cantor set in the unit interval. Thus, $C_0 = [0,1]$, $C_1 = [0,\tfrac13] \cup [\tfrac23,1]$, and more generally

$$C_n = \bigcup \Bigl\{ [x, x + 3^{-n}] : x = \sum_{j=1}^{n} \epsilon_j 3^{-j},\ \epsilon_j \in \{0,2\} \Bigr\}.$$

Let w be the standard uniform measure on C. Thus $w(I) = 2^{-n}$ on each component I of $C_n$, $n \in \mathbb N_0$. We phrase this slightly differently. Let $\mathcal K$ be the collection of components of all the sets $C_n$. Then, for each $K \in \mathcal K$, there holds $w(K) = |K|^{\frac{\ln 2}{\ln 3}}$.
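
As a sanity check on this construction, here is a minimal Python sketch. The helper names `cantor_components` and `uniform_mass` are ours and do not come from the text. It builds the components of $C_n$ with exact rational endpoints and confirms numerically that $w(K) = |K|^{\ln 2/\ln 3}$ on each component.

```python
from fractions import Fraction
import math

ALPHA = math.log(2) / math.log(3)  # the exponent ln 2 / ln 3

def cantor_components(n):
    """Return the 2**n closed intervals of C_n as (left, right) pairs of Fractions."""
    comps = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        refined = []
        for left, right in comps:
            third = (right - left) / 3
            refined.append((left, left + third))    # keep the left third
            refined.append((right - third, right))  # keep the right third
        comps = refined
    return comps

def uniform_mass(n):
    """Mass that the uniform Cantor measure w gives to one component of C_n."""
    return Fraction(1, 2 ** n)

for n in range(6):
    comps = cantor_components(n)
    length = comps[0][1] - comps[0][0]
    # w(K) = |K|^(ln 2 / ln 3) on every component K of C_n
    assert math.isclose(float(uniform_mass(n)), float(length) ** ALPHA)
```

Exact fractions are used for the endpoints so that the interval structure is not blurred by rounding; only the final comparison of the two real exponents is done in floating point.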

The weight $\sigma$ will be a sum of point masses selected from the intervals in $\mathcal G$, taken to be the components of the open set $[0,1] - C$. ($\mathcal G$ is for 'gap.') Consider $Hw$ restricted to an interval $G \in \mathcal G$. This is a smooth, monotone function, hence it has a unique zero $z_G$. Then, the weight $\sigma$ is

$$\sigma := \sum_{G \in \mathcal G} s_G \cdot \delta_{z_G},$$

where $s_G > 0$ will be chosen momentarily, consistent with the $A_2$ condition. A second measure is given by $\sigma' := \sum_{G \in \mathcal G} s_G \cdot \delta_{z'_G}$, where $z'_G$ is the unique point in G at which $Hw(z'_G) = |G|^{\frac{\ln 2}{\ln 3} - 1}$.
The constants $s_G$ are specified by the simple $A_2$ ratio

$$\frac{w(3G)}{|G|} \cdot \frac{\sigma(G)}{|G|} = 2, \qquad \text{that is} \quad s_G = |G|^{2 - \frac{\ln 2}{\ln 3}}.$$
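
The location of the zeros $z_G$ can be explored numerically. The sketch below is only an illustration under stated assumptions: w is replaced by equal point masses at the midpoints of the components of $C_8$, the kernel is taken to be $1/(y-x)$ (the position of the zero does not depend on this sign convention), and the function names are ours. With this kernel the discretized $Hw$ is increasing on each gap, so bisection finds its zero.

```python
def cantor_midpoints(level):
    """Midpoints of the 2**level components of C_level; equal masses there stand in for w."""
    lefts = [0.0]
    for k in range(1, level + 1):
        lefts = lefts + [x + 2.0 / 3 ** k for x in lefts]
    half = 0.5 / 3 ** level
    return [x + half for x in lefts]

def hilbert_w(x, points):
    """Discretized Hw(x) with kernel 1/(y - x); the sign convention is an assumption."""
    mass = 1.0 / len(points)
    return sum(mass / (y - x) for y in points)

def zero_in_gap(a, b, points, iterations=60):
    """Bisection for the zero of the increasing function Hw on the gap (a, b)."""
    lo, hi = a, b
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if hilbert_w(mid, points) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

points = cantor_midpoints(8)
z_middle = zero_in_gap(1 / 3, 2 / 3, points)  # central gap: zero at the midpoint by symmetry
z_left = zero_in_gap(1 / 9, 2 / 9, points)    # gap (1/9, 2/9): zero is pushed off-center
```

In the central gap the zero sits at the midpoint by symmetry. In the gap $(1/9, 2/9)$ the mass of w on $[2/3, 1]$ is not balanced by anything on the left, so the zero moves to the left of the midpoint while staying inside the gap.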

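To see this, note that

$$w(3G) = w(G - |G|) + w(G + |G|) = 2|G|^{\frac{\ln 2}{\ln 3}},$$

since $G \pm |G|$ are components of some $C_n$. With this definition, the basic facts about w and $\sigma$ come from the geometry of the Cantor set and the relations below,

$$w(I) \le w(3I) \lesssim |I|^{\frac{\ln 2}{\ln 3}}, \qquad \sigma(I) \lesssim |I|^{2 - \frac{\ln 2}{\ln 3}}, \qquad I \text{ is triadic}. \tag{11.2}$$

On the other hand, if $I \in \mathcal G \cup \mathcal K$, the inequalities above can be reversed, namely

$$w(3I) \simeq |I|^{\frac{\ln 2}{\ln 3}}, \qquad \sigma(I) \simeq |I|^{2 - \frac{\ln 2}{\ln 3}}, \qquad I \in \mathcal G \cup \mathcal K. \tag{11.3}$$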
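
The normalization of $s_G$ can be checked by direct arithmetic. The following sketch (function name ours) evaluates the simple $A_2$ ratio for a gap of length $3^{-n}$, using only $w(3G) = 2|G|^{\ln 2/\ln 3}$ and $s_G = |G|^{2 - \ln 2/\ln 3}$ from the text.

```python
import math

ALPHA = math.log(2) / math.log(3)  # the exponent ln 2 / ln 3

def simple_a2_ratio(n):
    """(w(3G)/|G|) * (sigma(G)/|G|) for a gap G of length 3**-n."""
    length = 3.0 ** -n
    w_triple = 2.0 * length ** ALPHA   # w(3G) = 2 |G|^(ln 2 / ln 3)
    s_gap = length ** (2 - ALPHA)      # sigma(G) = s_G = |G|^(2 - ln 2 / ln 3)
    return (w_triple / length) * (s_gap / length)

ratios = [simple_a2_ratio(n) for n in range(1, 11)]
```

Every entry of `ratios` equals 2 up to rounding, independent of the generation n, which is the scale invariance that the choice of $s_G$ is designed to produce.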

The properties of these measures that we are establishing are as follows. 

Lemma 11.4. For the measures just defined, there holds

(1) The Hilbert transform $H_\sigma$ is bounded from $L^2(\sigma)$ to $L^2(w)$.

(2) The Hilbert transform $H_{\sigma'}$ is unbounded from $L^2(\sigma')$ to $L^2(w)$, but the pair of weights satisfy the $A_2$ condition, and the testing conditions

$$\sup_{I \text{ an interval}} \sigma'(I)^{-1} \int_I |H_{\sigma'} I|^2 \, dw < \infty.$$

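Concerning point 2, the unboundedness of $H_{\sigma'}$ is direct from the construction of $\sigma'$:

$$\int (Hw)^2 \, d\sigma' = \sum_{G \in \mathcal G} Hw(z'_G)^2 \, \sigma'(\{z'_G\}) = \sum_{G \in \mathcal G} |G|^{2 - \frac{\ln 2}{\ln 3} - 2\left(1 - \frac{\ln 2}{\ln 3}\right)} = \sum_{G \in \mathcal G} |G|^{\frac{\ln 2}{\ln 3}} = \infty. \tag{11.5}$$

There are exactly $2^{n-1}$ elements of $\mathcal G$ of length $3^{-n}$, proving the sum is infinite.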
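
The divergence in (11.5) can be made concrete: each generation of gaps contributes the same amount to the sum. The short sketch below (function names ours) uses only the count $2^{n-1}$ and the length $3^{-n}$ stated above.

```python
import math

ALPHA = math.log(2) / math.log(3)  # the exponent ln 2 / ln 3

def generation_sum(n):
    """Sum of |G|^(ln 2 / ln 3) over the 2**(n-1) gaps G of length 3**-n."""
    return 2 ** (n - 1) * (3.0 ** -n) ** ALPHA

def partial_sum(generations):
    """Partial sum of (11.5) over the first `generations` generations of gaps."""
    return sum(generation_sum(n) for n in range(1, generations + 1))
```

Since $(3^{-n})^{\ln 2/\ln 3} = 2^{-n}$, each generation contributes exactly $1/2$, so the partial sums grow linearly in the number of generations.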

11.2. The Poisson A2 Condition. 

Lemma 11.6. For either weight $\mu \in \{\sigma, \sigma'\}$, the pair of weights w, $\mu$ satisfy the $A_2$ condition.

Proof. It suffices to check the $A_2$ condition on the triadic intervals in the unit interval. Let us begin by showing that for any triadic interval $I \in \mathcal K \cup \mathcal G$,

$$P(\sigma, I) \lesssim \frac{\sigma(I)}{|I|}, \quad \text{and} \quad P(w, I) \lesssim \frac{w(3I)}{|I|}. \tag{11.7}$$

For then, the control of the simple $A_2$ ratio will imply the control of the full $A_2$ ratio. (For the inequality on w, the triple of the interval appears on the right, since w(I) can be zero if $I \in \mathcal G$.) Now, it will be clear that this argument is insensitive to the location of the points $z_G$ and $z'_G$, so the same argument for $\sigma$ will work equally well for $\sigma'$.

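Let us consider $\sigma$. Using (11.3), there holds

$$\begin{aligned}
P(\sigma, I) &\le \frac{\sigma(I)}{|I|} + \sum_{k=1}^{\infty} \int_{3^{k} I \setminus 3^{k-1} I} \frac{|I|}{(|I| + \operatorname{dist}(x,I))^2} \, \sigma(dx) \\
&\lesssim \frac{\sigma(I)}{|I|} + \sum_{k=1}^{\infty} 3^{-k} |3^k I|^{1 - \frac{\ln 2}{\ln 3}} \\
&\lesssim \frac{\sigma(I)}{|I|} \sum_{k=0}^{\infty} 3^{-k \frac{\ln 2}{\ln 3}} \lesssim \frac{\sigma(I)}{|I|}.
\end{aligned}$$

Turning to the weight w, one has

$$\begin{aligned}
P(w, I) &\le \frac{w(3I)}{|I|} + \sum_{k=2}^{\infty} \int_{3^{k} I \setminus 3^{k-1} I} \frac{|I|}{(|I| + \operatorname{dist}(x,I))^2} \, w(dx) \\
&\lesssim \frac{w(3I)}{|I|} + \sum_{k=2}^{\infty} \frac{w(3^k I)}{3^k |3^k I|} \\
&\lesssim \frac{w(3I)}{|I|} + \sum_{k=2}^{\infty} 3^{-k} |3^k I|^{-1 + \frac{\ln 2}{\ln 3}} \\
&\lesssim \frac{w(3I)}{|I|} \sum_{k=1}^{\infty} 3^{-k \left(2 - \frac{\ln 2}{\ln 3}\right)} \lesssim \frac{w(3I)}{|I|}.
\end{aligned}$$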



The $A_2$ product $P(\sigma, I) \cdot P(w, I)$ has been bounded for $I \in \mathcal K \cup \mathcal G$. Suppose that I is a triadic interval that is not in these two collections. Then, there is a smallest integer k such that one of $I^{(k)} - 3^k|I|$, $I^{(k)}$, or $I^{(k)} + 3^k|I|$ is in $\mathcal K \cup \mathcal G$. Here, $I^{(k)}$ denotes the k-fold parent of I in the triadic grid. It follows that neither $\sigma$ nor w assigns positive mass to the interval $3I^{(k-1)}$, and so it follows that

$$P(\sigma, I) = \int_{[0,1] \setminus 3I^{(k-1)}} \frac{|I|}{(|I| + \operatorname{dist}(x, I))^2} \, \sigma(dx) \simeq 3^{-k} P(\sigma, I^{(k)}),$$

and likewise for w. Hence, the full $A_2$ condition follows. □
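
The second inequality in (11.7) can be tested numerically on a few gaps. This is a rough sketch under stated assumptions: w is approximated by equal point masses at the midpoints of the components of $C_8$, the Poisson average is taken in the form $P(w,I) = \int |I| (|I| + \operatorname{dist}(x,I))^{-2} \, w(dx)$, and all function names are ours.

```python
def cantor_midpoints(level):
    """Midpoints of the 2**level components of C_level; equal masses there stand in for w."""
    lefts = [0.0]
    for k in range(1, level + 1):
        lefts = lefts + [x + 2.0 / 3 ** k for x in lefts]
    half = 0.5 / 3 ** level
    return [x + half for x in lefts]

def poisson_w(a, b, points):
    """Discretized P(w, I) for I = (a, b)."""
    length = b - a
    mass = 1.0 / len(points)
    total = 0.0
    for y in points:
        dist = max(a - y, y - b, 0.0)  # distance from y to the interval I
        total += mass * length / (length + dist) ** 2
    return total

def mass_of_triple(a, b, points):
    """Discretized w(3I) for I = (a, b)."""
    length = b - a
    mass = 1.0 / len(points)
    return mass * sum(1 for y in points if a - length <= y <= b + length)

def poisson_ratio(a, b, points):
    """P(w, I) divided by w(3I)/|I|; (11.7) says this stays bounded on gaps."""
    return poisson_w(a, b, points) / (mass_of_triple(a, b, points) / (b - a))

points = cantor_midpoints(8)
gaps = [(1 / 3, 2 / 3), (1 / 9, 2 / 9), (7 / 9, 8 / 9), (1 / 27, 2 / 27)]
ratios = [poisson_ratio(a, b, points) for a, b in gaps]
```

The ratio is at least $1/4$ because the Poisson kernel is at least $1/(4|I|)$ on $3I$; the upper bound is the content of (11.7), and on these gaps the computed ratios stay of order one.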



There is the following corollary, relevant to the question of Sarason, and Nazarov's counterexample to the same. Combine the lemma above and (11.5) to see that

Corollary 11.8. The pair of weights $(\sigma', w)$ satisfy the full Poisson $A_2$ condition, but $H_w$ is unbounded from $L^2(w)$ to $L^2(\sigma')$.

11.3. The Testing Conditions. We turn to the testing conditions, using in an essential way the precise definition of the weight $\sigma$: it gives a huge cancellation, which simplifies things considerably.

Lemma 11.9. For any interval I, there holds

$$\int_I |H_w I|^2 \, d\sigma \lesssim w(I).$$

Proof. By construction of $\sigma$, there are two reductions. The first is simple, namely that each of the two endpoints of the interval I can be taken to be an endpoint of an interval in $\mathcal G$. The second comes from the construction of $\sigma$: $Hw = 0$, relative to $d\sigma$ measure. Hence,

$$\int_I |H_w I|^2 \, d\sigma = \int_I H_w([0,1] - I)^2 \, d\sigma,$$

namely the complement of I is the argument of the Hilbert transform on the right.

Then, one abandons all further cancellations. Let us show that for all intervals $K \in \mathcal K$ (the components of the sets $C_n$ which generate the Cantor set),

$$\int_K |H_w K_{rt}|^2 \, d\sigma \lesssim w(K), \tag{11.10}$$

where $K_{rt}$ is the right component of $[0,1] \setminus K$. The same estimate holds for the left component, and this completes the proof. For, if we set $I_{rt}$ to be the right component of $[0,1] \setminus I$, and take $K^1, K^2, \dots$ to be the maximal intervals in $\mathcal K$ contained in I, there holds

$$\int_I |H_w I_{rt}|^2 \, d\sigma \le \sum_{n=1}^{\infty} \int_{K^n} (H_w K^n_{rt})^2 \, d\sigma \lesssim \sum_{n=1}^{\infty} w(K^n) \le w(I).$$



Now, for $K \in \mathcal K$, let $K_1, K_2, \dots$ be the maximal intervals in $\mathcal K$ that lie to the right of K. Arranging them in increasing length, note that the length of $K_1$ is either $|K|$ or $3|K|$. For $n \ge 2$, the length of $K_n$ increases by a factor of 3, and $\operatorname{dist}(K, K_n) \ge |K_n|$, and hence there are at most $1 - \log_3 |K|$ such intervals in $\mathcal K$. Here is an illustration:

[Illustration: the interval K, followed to its right by the intervals $K_1$, $K_2$, $K_3$.]

Then, one has the estimate below, where the sum is of a decreasing geometric series, estimated by its first term.

$$|H_w K_{rt}(x)| \lesssim \sum_{n=1}^{\infty} \frac{w(K_n)}{|K_n|} \lesssim \frac{w(K)}{|K|}, \qquad x \in K.$$

Hence, (11.10) follows from the control of the $A_2$ ratio.



An important part of the remaining arguments is that the points $z_G$ and $z'_G$ cannot cluster close to the boundary of G.

Lemma 11.11. There is a constant < c < ^ such that 

$$|z_G - z'_G| < c|G|.$$

Proof. Estimate $Hw$ at the midpoint $z''_G$ of a component G. By symmetry of the Hilbert transform, and the Cantor set, it always holds that $H(w \mathbf 1_{3G})(z''_G) = 0$, so that appealing to (11.2),

$$|Hw(z''_G)| = |H(w \mathbf 1_{(3G)^c})(z''_G)| \lesssim \sum_{k=2}^{\infty} \frac{w(3^k G)}{|3^k G|} \lesssim \sum_{k=2}^{\infty} |3^k G|^{-1 + \frac{\ln 2}{\ln 3}} \lesssim |G|^{-1 + \frac{\ln 2}{\ln 3}}.$$

Next, we turn to a derivative calculation. The function $Hw$, restricted to G, is a smooth function, one that diverges at the endpoints of G at a rate that reflects the fractal dimension of G. For any $x \in G$ note that

$$\frac{d}{dx} Hw(x) \gtrsim \frac{w(3G)}{|G|^2} \simeq |G|^{-2 + \frac{\ln 2}{\ln 3}}.$$

This is a uniform lower bound, and in fact the lower bound is very poor at the boundaries of G. Indeed,

$$\frac{d}{dx} Hw(x) \gtrsim \operatorname{dist}(x, \partial G)^{-2 + \frac{\ln 2}{\ln 3}}.$$

It follows that we must have $|z_G - z'_G| < c|G|$, with c as in the statement of the lemma. That is, one need only move a fixed small multiple of $|G|$ in passing from the location of the zero $z_G$ to the point $z'_G$. □



The second half of the testing inequalities is as follows.

Lemma 11.12. For $\mu \in \{\sigma, \sigma'\}$, and any interval I,

$$\int_I |H_\mu I|^2 \, dw \lesssim \mu(I). \tag{11.13}$$

Proof. For the sake of specificity, let $\mu = \sigma$. Indeed, by Lemma 11.11, the same argument will work for $\sigma'$.

To fix ideas, let us assume that $I \in \mathcal K$. Write the left, middle and right thirds of I as $I_{-1}, I_0, I_1$, respectively. Then, note that



H^(I)^ dw 



H^ir dw 



1-1 Uli 



(11.14) 
(11.15) 



H<,(Io)^dw + 



I-lUl, 



+ 



H<,(I_i)^dw + 



H<,(Io + Ii)^dw + 



H„(Iir dw. 



H<,(I_i + lo)^ dw 



i-i 



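The first term on the right is simple. On the interval $I_0$, $\sigma$ is a point mass, at a point that is at distance $\ge c|I|$ from $I_{\pm 1}$. Thus, by (11.3),

$$\int_{I_{-1} \cup I_1} H_\sigma(I_0)^2 \, dw \lesssim |I|^{2 - 2\frac{\ln 2}{\ln 3}} \cdot |I|^{\frac{\ln 2}{\ln 3}} \simeq \sigma(I).$$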



That completes the first integral. The remaining two integrals in (11.14) are handled by a similar argument.

Concerning the two integrals in (11.15), one should note that $I_{\pm 1} \in \mathcal K$ and that $\sigma(I_{\pm 1}) \le 3^{-2 + \frac{\ln 2}{\ln 3}} \sigma(I)$. This geometric factor is smaller than $\tfrac12$, therefore one can recurse on (11.14) and (11.15) to see that

$$\int_K H_\sigma(K)^2 \, dw \lesssim \sigma(K), \qquad K \in \mathcal K. \tag{11.16}$$



For a general interval I, since $\sigma$ is a sum of Dirac masses, we can assume that the interval I is in a canonical form. Namely, each endpoint of I can be assumed to be an endpoint of an interval in $\mathcal G$. The basic inequality is

$$\sum_{K \in \mathcal K_I} \int_K |H_\sigma(I - K)|^2 \, dw \lesssim \sigma(I), \tag{11.17}$$

where $\mathcal K_I$ is the collection of maximal elements of $\mathcal K$ contained in I. The integration is over K, and the argument of the Hilbert transform is $I - K$.

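To see that (11.17) implies the Lemma, note that by (11.16),

$$\begin{aligned}
\int_I H_\sigma(I)^2 \, dw &= \sum_{K \in \mathcal K_I} \int_K H_\sigma(I)^2 \, dw \\
&\lesssim \sum_{K \in \mathcal K_I} \int_K H_\sigma(I - K)^2 \, dw + \sum_{K \in \mathcal K_I} \int_K H_\sigma(K)^2 \, dw \\
&\lesssim \sigma(I) + \sum_{K \in \mathcal K_I} \sigma(K) \lesssim \sigma(I).
\end{aligned}$$

In fact, (11.17) follows from

$$\int_K |H_\sigma(I - K)|^2 \, dw \lesssim \frac{\sigma(I)^2}{|I|^2} \, w(K), \qquad K \in \mathcal K_I. \tag{11.18}$$

For this is summed over $K \in \mathcal K_I$, and then one uses the $A_2$ property.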

To prove (11.18), all hope of cancellation is abandoned. For an interval $K \in \mathcal K_I$, let us consider the component $I_{rt}$ of $I - K$ which lies to the right of K. It has a Whitney-like decomposition into a finite sequence of intervals $J_1, \dots, J_t$ that we construct now. These intervals will have the property that they
are (a) pairwise disjoint, (b) their union is I^t, (c) and dist(K,supp(aJs)) > |Js| > 32|K|, for all 1 < s < t. 

Now, $J_1 = K + |K| \in \mathcal G$. If this interval is not contained in I, it follows that K contains the right hand endpoint of I, and there is nothing to prove. Assuming that $J_1 \subset I$, the inductive step is this. Given $J_1, \dots, J_s$ as above, whose union is not $I_{rt}$:

(1) If $J_s \in \mathcal G$, then $J_s + |J_s| \in \mathcal K$. If this interval is contained in $I_{rt}$, then we take $J_{s+1} = J_s + |J_s| \in \mathcal K$ and repeat the recursion. Otherwise, we update $J_s := I_{rt} - \bigcup_{u=1}^{s-1} J_u$, and the recursion stops.

(2) If $J_s \in \mathcal K$, then it follows that $J_{s-1} \in \mathcal G$, and the element of $\mathcal G$ immediately to the right of $J_s$ is $3(J_s + 6|J_s|)$. If this interval is contained in $I_{rt}$, then we take $J_{s+1} = 3(J_s + 6|J_s|) \in \mathcal G$, and repeat the recursion. Otherwise, we update $J_s := I_{rt} - \bigcup_{u=1}^{s-1} J_u$, and the recursion stops.

With this construction, it follows that

$$|H_\sigma(I_{rt})| \cdot \mathbf 1_K \lesssim \sum_{u=1}^{t} \frac{\sigma(J_u)}{|J_u|} \lesssim \sum_{u=1}^{t} |J_u|^{1 - \frac{\ln 2}{\ln 3}} \lesssim \frac{\sigma(I)}{|I|}.$$

This proves the 'right half' of (11.18), that is, when the argument of the Hilbert transform is $I_{rt}$. The 'left half' is the same, so the proof is complete. □

At this point, we have proven that the pair of weights $(w, \sigma')$ satisfy the full Poisson $A_2$ condition, and the testing condition (11.13). But $\|Hw\|_{L^2(\sigma')}$ is infinite, by (11.5). Hence, points (1) and (2) of Theorem 11.1 are shown.

We have also shown that the pair of weights $(w, \sigma)$ satisfy the full Poisson $A_2$ condition, and both sets of testing conditions. Hence, $H_w$ is bounded from $L^2(w)$ to $L^2(\sigma)$. This pair of weights also fails the pivotal condition of Nazarov-Treil-Volberg [33]. This is verified by observing that the collection $\mathcal G$ is a partition of [0, 1], and

$$\sum_{G \in \mathcal G} P(w, G)^2 \sigma(G) \simeq \sum_{G \in \mathcal G} \frac{w(3G)^2}{|G|^2} \sigma(G) \simeq \sum_{G \in \mathcal G} w(3G) \simeq \sum_{G \in \mathcal G} |G|^{\frac{\ln 2}{\ln 3}} = \infty,$$

since $\mathcal G$ contains $2^{n-1}$ intervals of length $3^{-n}$, for all integers $n \ge 1$. Here, we have used (11.7), followed by (11.2). Since $\inf_{x \in G} Mw(x) \gtrsim P(w, G)$, this also shows that the maximal function M is not bounded from $L^2(w)$ to $L^2(\sigma)$.

Notice in contrast that the energy inequality for the partition $\mathcal G$ is trivial, since $\sigma$ restricted to any interval G is a point mass, hence $E(\sigma, G) = 0$, for all $G \in \mathcal G$.

11.4. Context and Discussion. 

11.4.1. Counterexamples were an important source of inspiration on these questions. The early paper of Muckenhoupt and Wheeden includes an example showing that the simple $A_2$ condition below is not sufficient for the two weight inequality.

$$\sup_I \frac{w(I)}{|I|} \cdot \frac{\sigma(I)}{|I|} < \infty.$$

For instance, the boundedness of the simple $A_2$ ratio is simple to check for the pair $w = \delta_0$ and $\sigma(dx) = x \mathbf 1_{[0,\infty)} \, dx$. Then, one sees that for $f = \frac{1}{x} \mathbf 1_{[1,L]}$,

$$\sqrt{\log L} \, \|f\|_\sigma \simeq \log L \simeq \|H_\sigma f\|_w, \qquad L > 1.$$

Thus, the Hilbert transform is unbounded. And, one can directly see that the half-Poisson $A_2$ condition fails.
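
The growth in this example is easy to observe numerically. The sketch below is only an illustration under stated assumptions: the kernel of $H_\sigma$ is taken to be $1/(y - x)$, the integrals are approximated by a midpoint rule, and the function names are ours. Since $w = \delta_0$, only the value of $H_\sigma f$ at $x = 0$ matters.

```python
import math

def midpoint_rule(integrand, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of `integrand` over [a, b]."""
    h = (b - a) / steps
    return h * sum(integrand(a + (i + 0.5) * h) for i in range(steps))

def sigma_norm(L):
    """||f||_{L^2(sigma)} for f = (1/x) 1_[1,L] and sigma(dx) = x dx on [0, infinity)."""
    return math.sqrt(midpoint_rule(lambda y: (1.0 / y) ** 2 * y, 1.0, L))

def hilbert_at_zero(L):
    """H_sigma f(0): the integral of f(y) * y / (y - 0) over [1, L]."""
    return midpoint_rule(lambda y: (1.0 / y) * y / y, 1.0, L)

def norm_ratio(L):
    """||H_sigma f||_{L^2(w)} / ||f||_{L^2(sigma)} for w = delta_0."""
    return hilbert_at_zero(L) / sigma_norm(L)

def simple_a2(a, b):
    """w(I) sigma(I) / |I|^2 for an interval I = [a, b] with a <= 0 <= b, a < b."""
    return (b * b / 2.0) / (b - a) ** 2
```

Both integrals equal $\log L$, so `norm_ratio(L)` behaves like $\sqrt{\log L}$ and is unbounded as L grows, while `simple_a2` never exceeds $1/2$.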

Much harder is the fact that the Poisson $A_2$ condition is not sufficient. This was the contribution of Nazarov [29]. This example led to the conjecture of Nazarov-Treil-Volberg. A more delicate example, of a pair of weights which satisfied the Poisson $A_2$ condition and one set of testing conditions, say (1.5), but not the norm inequality, was that of Nazarov-Volberg [35]. Both of these examples were indirect.

The Nazarov counterexample was also used to disprove a conjecture about similarity to a normal 
operator, as shown by Nikolski-Treil [37]. 
11.4.2. The example given here is directly inspired by a Cantor set type example in Sawyer's two weight
maximal function paper [47]. It is drawn from [21], with the purpose to show that the pivotal condition 
of Nazarov-Treil-Volberg [33,54] was not necessary for the two weight inequality to hold. This was an 
explicit example, and also pointed to the primary role of the notion of energy. It is very interesting and 
delicate, in that the point masses have to be placed on the zeros of the Hilbert transform, in order to 
obtain the boundedness of the transform. It is also humbling in that it still does not reveal how delicate 
the proof of the sufficiency in the main theorem needs to be. 

11.4.3. It is a subtle example of Maria Carmen Reguera [42] and Reguera-Thiele [44] that proves this, as is pointed out by Reguera-Scurry [43].

Theorem H. There is a pair of weights for which the maximal function $M_\sigma$ is bounded from $L^2(\sigma) \to L^2(w)$ and $M_w$ is bounded from $L^2(w) \to L^2(\sigma)$, but the norm inequality for the Hilbert transform (1.1) does not hold.

This is quite a bit more intricate than the examples we have presented. It had been suggested, in the 
early days of the weighted theory, that the boundedness of the maximal functions would be sufficient 
for the norm boundedness of the Hilbert transform. On the other hand, if one considers 'off-diagonal' 
estimates, then boundedness of the maximal function is sufficient for norm inequalities for singular 
integrals [8]. 

12. Applications of the Main Inequality 

The interest in the two weight problem stems from a range of potential applications arising in sophisticated arenas of complex function and spectral theory. The motivations for these questions are complicated, and based upon subtle theories. The connection to the two weight Hilbert transform is not always immediate, and the properties of interest are frequently more intricate than those of mere boundedness of a transform. Nevertheless, the acknowledged experts of [3] write ". . .we have found it both useful and conceptually appealing to transform the subject into a study of the mapping properties of discrete Hilbert transforms. We have learned to appreciate that the essential difficulties thus seem to appear in a more succinct form." A brief guide to the subjects, and some of the essential difficulties, follows.

12.1. Sarason's Question on Toeplitz Operators. This question arose from Sarason's work on exposed points of $H^1$ [46]. Indeed, this was part of an influential body of work that pointed to the distinguished role of de Branges spaces in the subject. This paper contains examples of pairs of functions f, g for which the individual Toeplitz operators were unbounded, but the composition bounded.

Question 12.1 (Sarason [45]). Characterize those pairs of outer functions $g, h \in H^2$ for which the composition of Toeplitz operators $T_g T_{\bar h}$ is bounded on $H^2$.

Following [45], for a function $h \in L^2(\mathbb T)$, the Toeplitz operator $T_h$ can be thought of as taking $f \in H^2$ to the space of analytic functions by the definition

$$T_h f(z) := \frac{1}{2\pi} \int_{\partial \mathbb D} f(e^{i\theta}) \, h(e^{i\theta}) \, \overline{k_z(e^{i\theta})} \, d\theta,$$

where $k_w(z) := \frac{1}{1 - \bar w z}$ is the reproducing kernel.
Figure 5. Sarason's Question concerns the top line of the diagram, which is equivalent to the lower part of the diagram. The down arrow on the left, labeled $M_{\bar h}$, is an isometry on the range, while the up arrow on the right, labeled $M_g$, is an isometry between the two spaces.

Also in [45] is an argument of S. Treil that a Poisson $A_2$ condition is a necessary condition for the boundedness of the composition:

$$\sup_{z \in \mathbb D} P|f|^2(z) \cdot P|g|^2(z) < \infty,$$

where P denotes the Poisson extension to the unit disk. Sarason wrote that 'It is tempting to conjecture that the last condition is also sufficient for the boundedness of $T_g T_{\bar f}$.' This statement, widely referred to as the Sarason Conjecture, is of interest in both the Hardy and Bergman space settings. (In the latter case, see the striking new results of [2].)

The connection with the two weight problem for the Hilbert transform is indicated by the diagram from [7, §5], see Figure 5. In the diagram, $M_{\bar h}$ is multiplication by $\bar h$ and $P_+$ is the Riesz projection from $L^2$ to $H^2$. The boundedness is equivalent to

$$M_g P_+ M_{\bar f} : H^2 \mapsto H^2.$$

The structure of outer functions leads to these simplifications. Since the product of analytic functions is analytic, the second $H^2$ above can be replaced by $L^2$, and then the outside multiplication $M_g$ can be replaced by $M_{|g|}$. Thus, we are considering $M_{|g|} P_+ M_{\bar f} : H^2 \mapsto L^2$. Now, $\bar f$ is anti-analytic, so we can replace $H^2$ above by $L^2$. Moreover, the multiplication operator $M_{f/|f|}$ is unitary, since an outer function can be equal to zero on $\mathbb T$ only on a set of measure zero. Thus, it is equivalent to consider

$$M_{|g|} P_+ M_{|f|} : L^2 \mapsto L^2.$$

This is a two weight inequality for $P_+$.¹

The Riesz projection is a linear combination of the identity and the Hilbert transform, and our main theorem will apply to it. Note that the inequality

$$\|P_+(|f| \phi)\|_{L^2(|g|^2 dx)} \lesssim \|\phi\|_{L^2(dx)}$$

is equivalent to

$$\|P_+(|f|^2 \psi)\|_{L^2(|g|^2 dx)} \lesssim \|\psi\|_{L^2(|f|^2 dx)}.$$

¹ Sergei Treil helped us with the history of this question.
Recall that P+ = I - ?H, according to how we defined the Hilbert transform, where I represents the identity operator. In the two weight setting, we interpret the norm inequality $\|P_+(\sigma f)\|_w \lesssim \|f\|_\sigma$ as uniform over all truncations $0 < \tau < 1$ defined by



P+,^(af) :=af + 



n 



nyy 

T<|x— vii<T-' y 



Theorem 12.2. For pairs of weights w, $\sigma$ that are absolutely continuous with respect to Lebesgue measure, the norm inequality $\|P_+(\sigma f)\|_w \lesssim \|f\|_\sigma$ holds if and only if the pair of weights satisfy the Poisson $A_2$ condition (1.4), and these testing inequalities hold, uniformly over all intervals I, for a finite positive constant $\mathcal T$,

$$\int_I |P_+(\sigma \mathbf 1_I)|^2 \, w(dx) \le \mathcal T^2 \sigma(I), \qquad \int_I |P_+(w \mathbf 1_I)|^2 \, \sigma(dx) \le \mathcal T^2 w(I).$$

One must be sure that the $A_2$ inequality is necessary for the norm inequality. As it suffices to test real-valued functions, the real-variable proof given here will suffice. This in particular shows that for the densities of the weights, $\sigma(x) \cdot w(x) \lesssim A_2$ for a.e. x. Thus, the identity part of the norm inequality, and of the testing inequalities, is trivial. The remaining parts just concern the Hilbert transform, so one can use the main result.

If one is interested in the Sarason question for functions f, g that are not outer, there is no simple reduction to the two weight inequality for the Hilbert transform, and the problem is quite subtle, as the role of the multiplier $P_+ M_{\bar f}$ is more involved than that of just a weight.



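12.2. Model Spaces. For a probability measure $\sigma$ on $\mathbb T$, define a holomorphic function $\theta$ on $\mathbb D$ by the Poisson integral

$$\frac{1}{1 - \theta(z)} = \int_{\mathbb T} \frac{1}{1 - z \bar\zeta} \, \sigma(d\zeta).$$

This is an inner function: a holomorphic map of $\mathbb D$ to itself which is unimodular a.e. on $\mathbb T$. Also, $\theta(0) = 0$, since $\sigma$ is a probability measure. (The measure $\sigma$ is a Clark measure for $\theta$, frequently written as $\sigma_1$.)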
The shift operator $Sf(z) = zf(z)$ on $H^2$ has invariant subspace $\theta H^2 = \{\theta f : f \in H^2\}$, whence $K_\theta := H^2 \ominus \theta H^2$ is invariant for $S^*$. Beurling's theorem states that every invariant subspace for $S^*$ is of this form. The model operator is $S_\theta := P_\theta S$, where $P_\theta$ is the orthogonal projection from $H^2$ onto $K_\theta$. Remarkably, subject to mild conditions, every contractive operator on a Hilbert space is unitarily equivalent to a properly chosen $S_\theta$. For this, and other reasons, properties of the $K_\theta$ spaces have broad significance.

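The spaces $K_\theta$ and $L^2(\sigma)$ are unitarily equivalent, with the unitary map from $f \in L^2(\sigma)$ to $F \in K_\theta$ given by

$$F(z) = (1 - \theta(z)) \int_{\mathbb T} \frac{f(\zeta)}{1 - z \bar\zeta} \, \sigma(d\zeta).$$

One is interested in those measures $\mu$ on $\mathbb T$ for which the natural embedding operator is bounded from $K_\theta$ to $L^2(\mu)$, namely, is it the case that $\|F\|_{L^2(\mu)} \lesssim \|F\|_{K_\theta}$? We see that this bound is equivalent to

$$\int \Bigl| \int_{\mathbb T} \frac{f(\zeta)}{1 - z \bar\zeta} \, \sigma(d\zeta) \Bigr|^2 \, |1 - \theta(z)|^2 \, \mu(dz) \lesssim \|f\|_\sigma^2.$$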

That is, the question is equivalent to a two weight inequality for the Hilbert transform on 
From this perspective, one can lift counterexamples concerning the two weight Hilbert transform to those for embedding operators, which is the tactic of [35], from which we have taken this condensed presentation. A characterization of the embedding question can be read off from our main theorem.

This subject is profound. The model spaces are also important to spectral theory, and the subject of rank one perturbations of a unitary operator. In spectral theory, it is important to understand the structure of the unitary operator that sends the Hilbert space into $L^2$ of the spectral measure. Weighted Hilbert transforms arise therein. See for instance [37], which uses the example of Nazarov showing that the $A_2$ condition is not sufficient for the boundedness of the Hilbert transform. Also see [26].

We point the interested readers to [36,41], and the many citations therein for more information about 
these subjects. 

12.3. de Branges Spaces. We recall the setting of [3,4]. For a sequence of distinct points $\Gamma = \{\gamma_n\} \subset \mathbb C$ and a sequence of positive numbers $v = \{v_n\}$ consider the Cauchy transform

$$H_{(\Gamma, v)} : a = \{a_n\} \mapsto \sum_n \frac{a_n v_n}{z - \gamma_n}.$$

This is well defined for $a \in \ell^2_v$ and $z \in \Omega$, defined by

$$\Omega := \Bigl\{ z \in \mathbb C : \sum_{n : z \ne \gamma_n} \frac{v_n}{|z - \gamma_n|^2} < \infty \Bigr\}.$$

Call $\mathcal H(\Gamma, v)$ the space of functions analytic on $\Omega$ given by the image of $\ell^2_v$ under $H_{(\Gamma, v)}$. For appropriate choices of $(\Gamma, v)$, these Hilbert spaces have deep connections to analytic function spaces. For instance, the reproducing kernels of $\mathcal H(\Gamma, v)$ are

$$k_\zeta(z) = \sum_n \frac{v_n}{(z - \gamma_n)(\bar\zeta - \bar\gamma_n)}.$$

And, many natural questions, such as the structure of frames of reproducing kernels for $\mathcal H(\Gamma, v)$, require knowledge about the two weight inequality for the Cauchy transform. For instance, the main real-variable result in [3] is a characterization of a two weight inequality, but under the requirement that both measures be a sum of point masses on sparse collections of points. This yields interesting results in the setting of de Branges spaces.

The definition of $\mathcal H(\Gamma, v)$ provides just one possible representation of a de Branges space, a class of Hilbert spaces with remarkable properties. The standard reference for them is [9]. Beginning from the works of Sarason [46], they have become an essential part of the subject of analytic function spaces.